ABSTRACT


We  would  like  to  build  story  understanding  systems  which  are  transparent,  modular,  and 
extensible.  To  this  end,  we  have  been  working  on  a  new  logical  approach  to  narrative 
understanding  that  features  a  GPSG-style  grammar  and  an  episodic  logic  with  probabilistic 
inference  rules.  The  grammar  represents  phrase  structure  and  the  relationship  between  phrase 
structure  and  logical  form  in  a  modular,  explicit  form.  The  logical  representation  allows 
propositional  attitudes,  unreliable  generalizations,  and  other  non-standard  constructs,  providing  a 
uniform,  transparent  knowledge  representation  for  both  the  explicit  content  of  stories  and  for  the 
background  knowledge  needed  to  understand  them.  It  makes  systematic  use  of  episodic  variables 
in  the  representation  of  episodic  sentences,  using  these  to  capture  temporal  and  causal 
relationships.  The  rules  of  inference  include  probabilistic  versions  of  deduction  rules  resembling 
forward  and  backward  chaining  rules  in  expert  systems.  These  can  be  used  for  predictive, 
explanatory,  and  simulative  inference.  We  illustrate  our  approach  with  nontrivial  grammar 
fragments  (including  semantic  rules),  and  with  an  extended  example  of  forward-chaining  inference 
based  on  a  sentence  from  Little  Red  Riding  Hood.  A  pilot  implementation  is  able  to  make  many 
(though  not  all)  of  the  inferences  we  describe. 
1  Introduction


Many  ideas  and  systems  have  been  developed  for  narrative  understanding,  and  some 
of  them,  e.g.,  CYRUS  (Kolodner  1981)  and  BORIS  (Lehnert  et  al.  1983,  Dyer  1983), 
have  shown  a  remarkable  degree  of  understanding  in  complicated  human  domains.  These 
ambitious  systems,  however,  are  very  complex  and  hard  to  extend  beyond  the  few  stories 
they  handle.  We  believe  this  is  so  because  of  insufficient  expressiveness  and  clarity  of 
the knowledge representations used, for knowledge about language as well as about the
world,  and  about  inferences  warranted  by  that  knowledge. 

For example, the meaning representations used often cannot express complex quantification
(“most people with two or more cars”), logical compounding (“If he fails, he is
either  lazy  or  a  fool”),  complex  concepts  (“the  type  of  person  who  never  forgets  a  slight”), 
modification  (“a  nearly  invisible  pale  brown  birthmark”),  temporal  relations  (“He  had 
seen  her  twice  the  previous  week”),  and  so  on.  As  well,  knowledge  about  language  and 
about  the  world  are  often  buried  in  procedures  (e.g.,  procedures  which  seek  semantically 
appropriate  fillers  for  frame  slots)  in  a  way  that  makes  it  very  hard  to  determine  what 
linguistic  and  factual  assumptions  have  been  made. 

These  considerations  have  led  us  to  an  approach  to  narrative  understanding  in 
which all types of linguistic, world and inference knowledge are represented in an explicit,
analyzable form. We have chosen Generalized Phrase Structure Grammar (GPSG;
Gazdar et al. 1985) as our grammatical representation and have been developing a
knowledge representation, called episodic logic (a kind of situational logic; cf. Barwise
and Perry 1983, Barwise 1989), for encoding both the content of narratives and the
knowledge needed to understand them. GPSG is a particularly perspicuous grammatical
formalism which is expressively adequate for almost all English grammatical phenomena,
and is relatively easy to use by a parser and logical-form generator. Episodic logic
is expressively rich and close enough to surface form so that the relationship between
surface form and logical form can be specified in a modular, transparent way. It introduces
episodic variables so that implicit, context-dependent relationships among episodes
(events, situations, circumstances, eventualities, etc.) can be made explicit. It also allows
the  representation  of  restricted  quantifiers,  propositional  attitudes,  predicate  modifiers, 
nominalized  predicates,  and  perhaps  most  importantly,  unreliable  generalizations.  Such 
generalizations  have  recently  received  much  attention  in  the  non-monotonic  reasoning 
literature  and  elsewhere  (e.g.,  linguistic  semantics).  The  practical  adequacy  of  our  logic 
has  been  tested  on  small  story  fragments. 

In  the  next  section,  we  motivate  and  explain  some  of  the  unusual  features  of  our 
logic.  In  section  3,  we  provide  preliminary  semantics.  While  our  main  concern  in  this 
paper  is  with  representation  and  inference,  we  need  to  make  plausible  the  claim  that 
linguistic  input  could  be  mapped  into  our  logical  representation  in  a  principled  and 
transparent  fashion.  Therefore,  in  section  4,  we  sketch  the  derivation  of  episodic  logical 
form  from  surface  structure  using  a  GPSG  grammar.  Next,  in  section  5,  we  introduce 
some  inference  rules  and  indicate  their  role  within  our  implementation.  In  section  6,  we 
illustrate the inference process in story understanding with an extended example based
on  a  small  fragment  of  Little  Red  Riding  Hood.  In  the  concluding  section,  we  comment 
on  related  work  and  assess  the  progress  made  and  work  still  to  be  done. 


2  Episodic  Logic 

In our approach to interpreting English text, the representation of an input sentence is
obtained in several (possibly interleaved) processing stages. Initial representations are
in general ambiguous (e.g., with respect to the scopes of quantifiers and other operators)
and context-dependent (e.g., involving indexical operators like past, whose interpretation
depends on the utterance time). The various processing stages are aimed at removing
ambiguity and context-dependence. We will loosely refer to the outputs of any of the
stages  as  “logical  forms.”  However,  the  emphasis  in  this  section  is  on  the  final  logical 
forms,  i.e.,  the  context-independent  representations  which  are  ultimately  “committed  to 
memory,”  ready  for  use  in  inferential  processes. 

Our initial logical form closely follows surface structure, and even the final form
is  rather  English-like.  It  “mimics”  noun  phrases  in  its  use  of  restricted  quantifiers,  and 
follows  English  sentence  syntax  by  having  the  “subject”  of  a  predication  precede  the 
predicate.  For  example,  “Every  dog  has  a  tail”  would  be  represented  (minus  tense)  as 

(1) (∀x:[x dog] (∃y:[y tail] [x have-as-part y]))

after predicate disambiguation and quantifier scoping.1 (The initial representation computed
from the surface form would be [<∀ dog> have <∃ tail>].)
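
The reading of restricted quantifiers in (1) can be checked against a small finite model. The following Python sketch is only an illustration under stated assumptions: the domain, the extensions chosen for dog, tail and have-as-part, and the helper names forall_restricted and exists_restricted are all hypothetical and are not taken from the pilot implementation.

```python
from typing import Callable, Iterable

Pred = Callable[[str], bool]


def forall_restricted(domain: Iterable[str], restriction: Pred, body: Pred) -> bool:
    """(∀x:Φ Ψ): Ψ holds of every individual in the domain that satisfies Φ."""
    return all(body(x) for x in domain if restriction(x))


def exists_restricted(domain: Iterable[str], restriction: Pred, body: Pred) -> bool:
    """(∃x:Φ Ψ): Ψ holds of at least one individual that satisfies Φ."""
    return any(body(x) for x in domain if restriction(x))


# Hypothetical toy model (an assumption of this sketch).
DOMAIN = ["fido", "rex", "tail1", "tail2", "rock"]
DOG = {"fido", "rex"}
TAIL = {"tail1", "tail2"}
HAVE_AS_PART = {("fido", "tail1"), ("rex", "tail2")}


def every_dog_has_a_tail() -> bool:
    """Evaluate (∀x:[x dog] (∃y:[y tail] [x have-as-part y]))."""
    return forall_restricted(
        DOMAIN,
        lambda x: x in DOG,
        lambda x: exists_restricted(
            DOMAIN,
            lambda y: y in TAIL,
            lambda y: (x, y) in HAVE_AS_PART,
        ),
    )


def unrestricted_reading() -> bool:
    """The same sentence with the restrictions unfolded into → and ∧."""
    return all(
        (x not in DOG)
        or any((y in TAIL) and ((x, y) in HAVE_AS_PART) for y in DOMAIN)
        for x in DOMAIN
    )
```

On this model both readings agree, which is the equivalence that the restricted-quantifier notation relies on; removing one dog's tail from HAVE_AS_PART makes both false.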

Two  more  features  that  lead  to  close  conformity  between  the  surface  form  and  our 
logical form are predicate modification and λ-abstraction. These are illustrated by (2)
and (3), with tense again neglected (and with the phrase in (3a) represented in unscoped
form in (3b)):

(2) a. Canada is very distant from Australia
b. [Canada (very λx[x distant-from Australia])];

(3) a. the brother of Mary who is a doctor
b. <The λx[[x brother-of Mary] ∧ [x doctor]]>.

In (2b), the predicate modifier very is a function which, when applied to a predicate,
yields another, more restricted predicate. Other examples of predicate modifiers are
almost, fake, coll (forming predicates over collections), and complex modifiers corresponding
to adverbials such as (adv-a λa[a with-instrument key3])2 which means “with
the key.”

1 We use restricted quantification of the form (Qα:ΦΨ), where Q is a quantifier, α is a variable,
and Φ and Ψ are formulas. That is, (∀α:ΦΨ) and (∃α:ΦΨ) are equivalent to (∀α)Φ → Ψ and
(∃α)Φ ∧ Ψ, respectively. When there is no restriction Φ, we write (QαΨ). Also note that we use square
brackets to indicate predicate infix expressions, round brackets for prefix expressions, and angle brackets
for unscoped operators. Scoping of quantifiers and other operators is discussed in Schubert & Pelletier
(1982), Hurum & Schubert (1986), and Hurum (1987, 1988).

Several operators for nominalizing (reifying) sentence or predicate intensions are
also available in episodic logic, including a proposition-forming operator That and
kind-forming operators (K, K_1 and two others), as illustrated in (4) to (7) (again neglecting
tense):

(4) a. That Mary smoked is not true
b. ¬[(That [Mary smoke]) true];

(5) a. Mary knows that John is intelligent
b. [Mary know (That [John intelligent])];

(6) a. Snow is white
b. [(K snow) white];

(7) a. A dog is a mammal
b. [(K_1 dog) mammal].3


Another distinctive feature of our logic, responsible for its name, is the inclusion
of episodic variables. Whereas examples (1) - (2) dealt with “enduring” properties
(though not eternal ones, a fact for which we eventually allow), (8) and (9) below involve
“episodic” ones:

(8)  Everyone  looked  at  Mary; 

(9)  This  (event)  made  her  blush. 

The  final  representations  of  (8)  and  (9)  involve  a  connective  “**”  relating  a  sentence  to 
the  episode  it  characterizes  (or  completely  describes).  With  some  simplifications,  they 
are: 

(8') (∃e1:[e1 before Now1]
        [(∀x:[x person][x look-at Mary]) ** e1]);

(9') (∃e2:[e2 before Now2]
        [[e1 cause-of e2] ∧ [[Mary blush] ** e2]]).

Thus (8') says that e1 is an episode characterized (or completely described) by “Everyone
looked at Mary,” and similarly for e2 in (9').4

2 adv-a (standing for action-modifying adverbial) is an operator that forms a predicate modifier from
a predicate over actions such as passionate or λa[a for-benefit Mary], which means “for Mary.”

3 K is used in the interpretation of mass nominals like snow and non-numeral bare plurals like dogs,
and K_1 in the interpretation of indefinite count singulars like a dog and bare numeral plurals like twelve
eggs (cf. p and p_1 in Schubert and Pelletier 1987).

4 Note the reduction of past tense to a relation placing episodes e1 and e2 before Now1 and Now2
(more accurately, before the utterances of the sentences (8) and (9)), respectively. This reduction is
obtained from an initial translation involving the indexical operator past; see section 4. Also, (8') would
be further expanded to show individual episodes of “person x looking at Mary,” occurring during the
A characterizing description of an episode provides all the facts that are supported
by it, except possibly for ones entailed by those given, via meaning postulates (MPs).
For example, blushing entails becoming (more) red in the face, and so “Mary blushed,
becoming  red  in  the  face”  (represented  logically)  could  still  characterize  episode  e2.  By 
contrast,  “Mary  blushed,  feeling  foolish”  cannot  characterize  the  same  episode,  only  a 
factually  more  comprehensive  one,  since  it  says  strictly  more  (i.e.,  feeling  foolish  is  not 
entailed  by  blushing,  via  MPs).  On  the  other  hand,  “Mary’s  face  changed  color”  cannot 
characterize  the  same  episode  either,  only  a  factually  less  comprehensive  one,  since  it 
says  strictly  less  (i.e.,  one’s  face  can  change  color,  e.g.,  by  turning  pale,  without  one’s 
blushing).  As  we  will  shortly  indicate,  the  notion  of  a  characterization  (or  complete 
description)  is  important  for  getting  causal  statements  right. 

Our logic also contains a weaker but more fundamental operator “*” that reads
“partially describes.” “*” is essentially an object-language embedding of the semantic
notion of truth over an episode or situation. (As such, [Φ * e] is similar to [Φ]*(e) in
Reichenbach (1947), and to e ⊨ Φ in Barwise (1989).) For example, [[Mary blush] * e]
entails the truth of [Mary blush] in episode e. As in the case of [[Mary blush] ** e], the
blushing must extend over the entire episode. The following meaning postulate relates
“**” to “*” (where Φ and Ψ are schema variables over formulas, and η is a schema
variable over terms):

□[[Φ ** η] → [[Ψ * η] → □(∀e[[Φ * e] → [Ψ * e]])]],

i.e., if Φ characterizes η, then if Ψ partially describes η, it does so as a matter of necessary
consequence from Φ.
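
The contrast between characterizing and partially describing an episode can be made concrete with a deliberately crude model in which an episode is just the set of atomic facts it supports. The Python sketch below makes several assumptions that go beyond the text: facts are plain strings, meaning postulates are a hypothetical lookup table, temporal extent is ignored, and the postulate that becoming red in the face entails a change of face color is added for illustration.

```python
# Hypothetical meaning postulates: each fact entails the listed facts.
MEANING_POSTULATES = {
    "Mary blush": {"Mary become-red-in-face"},
    "Mary become-red-in-face": {"Mary's-face change-color"},
}


def closure(facts):
    """Close a set of facts under the meaning postulates."""
    closed = set(facts)
    frontier = list(closed)
    while frontier:
        fact = frontier.pop()
        for entailed in MEANING_POSTULATES.get(fact, ()):
            if entailed not in closed:
                closed.add(entailed)
                frontier.append(entailed)
    return frozenset(closed)


def partially_describes(description, episode_facts):
    """Toy counterpart of *: everything the description says is supported."""
    return closure(description) <= episode_facts


def characterizes(description, episode_facts):
    """Toy counterpart of **: the description, closed under the meaning
    postulates, yields exactly the facts supported by the episode."""
    return closure(description) == episode_facts


# The facts supported by the blushing episode e2 in this toy model.
E2 = closure({"Mary blush"})
```

In this model {"Mary blush"} and {"Mary blush", "Mary become-red-in-face"} both characterize E2, {"Mary blush", "Mary feel-foolish"} neither characterizes nor partially describes it, and {"Mary's-face change-color"} partially describes it without characterizing it, mirroring the three cases discussed above.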

To see the significance of the distinction between “characterizations” and arbitrary
partial descriptions of episodes, suppose that the “**” in (8') were replaced by “*”. Then
(as a little thought shows) (8') and (9') would be true in a situation in which everyone
looked at Mary, laughing derisively, and it was this (more complex) event that made
her blush. Yet we would not say that (8) and (9) are both true in such a situation; so
(8'), with “**” weakened to “*”, would not be a correct formalization of (8). To put it
schematically, the inference

[Φ ** e1], [Ψ ** e2], [e1 cause-of e2] ⊢ [[Ψ ** e2] because [Φ ** e1]]

is sound, while

[Φ * e1], [Ψ * e2], [e1 cause-of e2] ⊢ [[Ψ * e2] because [Φ * e1]]

is not.

Another point to be noted above is the free occurrence of variable e1 in (9'),
outside  the  scope  of  its  quantifier.  This  is  permissible  in  our  logic  in  a  sequence  of 
conjoined  sentences  or  in  a  conditional,  thanks  to  a  “parameter”  mechanism  similar  to 

overall episode e1. Finally, the representation of (9) would actually contain the logical translation of the
verb phrase make Mary blush, viz., ((make blush) Mary), from which (9') would be derived by meaning
postulates.
that of discourse representation theory (DRT) (cf. Kamp 1981, Heim 1982). In effect,
the conjunction of (8') and (9') would be interpreted as if the ∃-quantifiers had widest
scope (see the semantics of “∃,” “The,” “∧” and “→” in the next section).

The parameter mechanism is also the key to representing generic conditionals, such
as “A wolf is (usually) gray,” “A child (usually) loves his or her grandmother,” or “When
two strangers meet in a deserted region, they often greet.” Generic conditionals take the
following form:

(∃x_1(∃x_2(...(∃x_k Φ)...))) →_p Ψ,

where Φ and Ψ are formulas involving x_1, x_2, ..., x_k, and p is a numeric lower bound
on frequency (statistical probability). The semantics of the connective “→_p” essentially
allows it to “take control” of parameters (existentially quantified variables) in the
antecedent, and iterate over their denotations. Thus,

(∃x[x wolf]) →_.8 [x gray]

says that a wolf is usually gray (or, most wolves are gray). Generic conditionals are often
used in causal axioms. In particular, predictive causal axioms assume the occurrence
of some particular type of episode e1 in the antecedent, and predict another episode e2
caused by e1 in the consequent. The following is an example of a predictive axiom:

When a predatory animal sees a non-predatory creature of comparable
or smaller size, it may want to attack and eat it.5

[(∃x:[x predatory-animal]
    (∃y:[[y creature] ∧ (¬[y predatory-animal]) ∧
         [[y as-big-as x] ∨ [y smaller-than x]]]
      (∃e1[[x see y] ** e1])))
 →_.6 [(∃e2:[[(begin e2) during e1] ∧ [e1 cause-of e2]]
          [[x want (K_a (attack y))] ** e2]) ∧
       (∃e3:[[(begin e3) during e1] ∧ [e1 cause-of e3]]
          [[x want (K_a (eat y))] ** e3])]]

Equally important are explanatory axioms, such as the following:

If a creature wants to eat some food, it is likely to be hungry.

[(∃x(∃e1[[x want (K_a λy(∃z:[z food-for y] [y eat z]))] ** e1]))
 →_.9 (∃e2:[[e2 cause-of e1] ∧ [e2 same-time e1]]
         [[x hungry] ** e2])]

5 K_a below is an operator that forms a kind of action from an action predicate intension. More on
this later.

Much of the world knowledge we use in our experimentation is in fact stated as causal
axioms like these.
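
The statistical reading of the frequency bound p on a generic conditional can be illustrated over a finite population. The sketch below is an assumption-laden toy, not the semantics of the next section: the function name generic_conditional_holds, the sample population, and the decision to count an empty antecedent class as vacuously satisfying the conditional are all choices made only for this example.

```python
from typing import Callable, Iterable, Tuple

Individual = Tuple[str, str]  # (kind, color); a hypothetical encoding


def generic_conditional_holds(
    population: Iterable[Individual],
    antecedent: Callable[[Individual], bool],
    consequent: Callable[[Individual], bool],
    p: float,
) -> bool:
    """Check that at least a fraction p of the antecedent cases satisfy
    the consequent, iterating over the denotations of the parameter."""
    cases = [x for x in population if antecedent(x)]
    if not cases:
        return True  # vacuous case; an assumption of this sketch
    hits = sum(1 for x in cases if consequent(x))
    return hits / len(cases) >= p


# Hypothetical population: five wolves (four gray) and one fox.
ANIMALS = [
    ("wolf", "gray"),
    ("wolf", "gray"),
    ("wolf", "gray"),
    ("wolf", "gray"),
    ("wolf", "white"),
    ("fox", "red"),
]


def is_wolf(x: Individual) -> bool:
    return x[0] == "wolf"


def is_gray(x: Individual) -> bool:
    return x[1] == "gray"
```

With this population, generic_conditional_holds(ANIMALS, is_wolf, is_gray, 0.8) is satisfied, while the same call with a bound of 0.9 is not.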

We  indicated  at  the  outset  that  episodes,  as  we  construe  that  term,  are  the  same 
as  events,  situations,  circumstances  or  eventualities  (though  we  may  eventually  want 
to  treat  “events”  as  being  episodes  in  which  something  happens,  as  opposed  to  those 
in  which  some  state  or  process  persists).  We  carefully  avoided  including  actions  or 
activities  in  this  list  since  we  do  not  regard  these  as  being  of  the  same  type  as  episodes. 
For  example,  it  seems  to  us  that  (10)  and  (11)  may  describe  the  same  episode  or  event 
(an exchange of a car for a sum of money), but different actions (a buying and a selling):

(10)  John  bought  the  car  from  Mary; 

(11)  Mary  sold  the  car  to  John. 

Note, in particular, that the buying in (10) may have been performed reluctantly and
the  selling  in  (11)  eagerly,  but  it  would  be  very  odd  to  say  that  the  events  described  in 
(10)  or  (11)  were  reluctant,  or  eager,  or  occurred  reluctantly  or  eagerly.  Events  simply 
do  not  have  such  properties. 

Nevertheless,  events  and  actions  seem  very  closely  related.  In  particular,  specific 
actions  appear  to  determine  specific  events  (namely,  the  performances  of  those  actions). 
The  difference  between  them,  we  suggest,  is  just  that  actions  have  well-defined  agents 
while  events  do  not.  Thus  it  makes  perfect  sense  to  ask  about  an  action  “Who  did  it?” 
(i.e.,  “Who  performed  the  action?”),  while  it  is  odd  to  ask  about  an  event  “Who  did  it?” 
(i.e.,  “Who  performed  the  event?”) 

If  specific  actions  determine  specific  events,  and  actions  have  well-defined  agents 
while the events they determine do not, then the simplest possible theory of their relationship
is this: actions are ordered agent-event pairs. Thus, for an event e which is a
performance  of  an  action  a  by  agent  x,  a  =  [x  |  e],  where  “|”  is  the  pairing  function. 
The  agent  of  an  action  is  then  just  the  first  element  of  the  action,  written  (fst  a),  and 
the  event  of  the  agent  performing  the  action  is  the  second  element  of  the  action,  written 
(rst  a).  We  have  found  this  view  very  helpful  in  the  analysis  of  adverbials  (some  of  which 
we  regard  as  modifying  events  -  e.g.,  locative  and  temporal  adverbials  -  and  some  of 
which we regard as modifying actions - e.g., manner adverbials). Also, the distinction
has  resolved  some  persistent  difficulties  we  encountered  in  reasoning  about  actions,  such 
as  the  “wicked”  actions  of  the  wolf  in  the  story  of  Little  Red  Riding  Hood. 
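
The view of actions as agent-event pairs lends itself to a direct data-structure sketch. In the Python below, the class names Episode and Action and the MANNER table are hypothetical illustrations; only the pairing itself and the accessors fst and rst come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    """An event, identified here only by a descriptive label."""
    label: str


@dataclass(frozen=True)
class Action:
    """An action as an ordered agent-event pair [x | e]."""
    agent: str
    event: Episode


def fst(a: Action) -> str:
    """The agent of the action."""
    return a.agent


def rst(a: Action) -> Episode:
    """The event of the agent performing the action."""
    return a.event


# Sentences (10) and (11): one event, two actions.
exchange = Episode("exchange of the car for a sum of money")
buying = Action("John", exchange)
selling = Action("Mary", exchange)

# Manner adverbials attach to actions, not to the shared event.
MANNER = {buying: "reluctantly", selling: "eagerly"}
```

Here rst(buying) and rst(selling) are the same episode, yet buying and selling are distinct objects, so one can be reluctant and the other eager without attributing either property to the event.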

We  should  remark  that  while  actions,  on  our  account,  are  agent-event  pairs,  the 
converse  is  not  in  general  true:  many  agent-event  pairs  are  not  actions.  For  instance,  if 
e is the event of the sun rising, then [John | e] is certainly not an action. Whether or not
an  agent-event  pair  is  an  action  depends  entirely  on  the  characterization  of  the  event,  a 
dependence  that  is  to  be  captured  by  meaning  postulates. 

In this connection, we should mention the two remaining kind-forming operators,
K_e and K_a, forming kinds of events and kinds of actions (more generally, attributes)
from sentence intensions and (action) predicate intensions respectively, as shown in the
following examples:

(12)  a.  For  Mary  to  dance  was  rare 
b.  [(Ke  [Mary  dance])  rare]; 

(13)  a.  To  kiss  Mary  is  fun 

b.  [(Ka  (kiss  Mary))  fun]. 

Finally,  some  preliminary  remarks  are  in  order  about  the  inference  rules  we  use  in 
narrative understanding. One very general inference rule, resembling those used in expert
systems,  is  called  Rule  Instantiation  (RI)  (see  section  5  for  details,  where  the  dual 
of  RI,  called  Goal  Chaining,  is  discussed  as  well).  In  some  cases,  this  can  be  thought  of 
as  a  general  form  of  modus  ponens  with  universal  instantiation  and  use  of  multiple  minor 
premises  (instantiating  the  antecedent  of  a  universally  quantified  conditional).  However, 
it  also  allows  instantiation  of  generic  conditionals  such  as  the  predictive  and  explanatory 
axioms  above.  As  a  simple  example,  RI  allows  the  inference 

(∃x[x wolf]) →_.8 [x gray],   [W wolf]^.9
-----------------------------------------
[W gray]^.72

The superscripted numbers are interpreted as lower bounds on epistemic probabilities
(in  contrast  with  the  statistical  interpretation  of  probabilities  modifying  the  connective 
in  generic  conditionals). 
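
The arithmetic behind this example can be sketched in a few lines. The function below assumes that the lower bound on the conclusion is simply the product of the rule's frequency bound and the bounds on the minor premises; this matches the single worked example (.8 × .9 = .72) but is otherwise an assumption of the sketch, and the name rule_instantiation_bound is hypothetical. The full rule is given in section 5.

```python
import math
from typing import Sequence


def rule_instantiation_bound(rule_bound: float, premise_bounds: Sequence[float]) -> float:
    """Lower bound on the epistemic probability of the conclusion, taken
    here to be the rule's frequency bound times the premise bounds."""
    return rule_bound * math.prod(premise_bounds)


# The wolf example: rule bound .8, premise [W wolf] held with bound .9.
w_gray = rule_instantiation_bound(0.8, [0.9])

# Chaining through a second, purely hypothetical rule with bound .9
# attenuates the bound further.
chained = rule_instantiation_bound(0.9, [w_gray])
```

The first call gives approximately .72 and the chained call approximately .648, which illustrates how bounds shrink along an inference chain.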

Note  that  successive  inference  steps  of  this  type  will  lead  to  attenuation  of  the 
probabilities  assigned  to  successive  conclusions.  Such  inference  chaining  (with  safeguards 
against  certain  fallacies)  is  done  routinely  in  our  implementation.  A  more  problematic 
issue is the “parallel” combination of evidence. This issue arises when several generalizations
(or inference chains) assign different epistemic probabilities to the same formula.
A partial solution, applicable when the antecedent of one generalization is more specific
than the other, i.e., entails the other, is to apply only the more specific rule (cf. Kyburg
1983, Bacchus 1988). However, this leaves open the question of how to combine logically
independent  (or  only  “probabilistically  dependent”)  bits  of  evidence.  We  hope  to  apply 
methods  similar  to  those  of  Pearl  (1988),  though  these  are  aimed  essentially  at  sentential 
reasoning  and  so  will  require  considerable  extension. 
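
The specificity preference mentioned above can be sketched as follows. This is a simplification under explicit assumptions: antecedents are modeled as sets of atomic predicates, "entails" is approximated by the proper-superset relation between those sets, and the rules and their numeric bounds (other than the .8 for gray wolves) are invented for illustration.

```python
from typing import FrozenSet, List, NamedTuple


class Generalization(NamedTuple):
    antecedent: FrozenSet[str]  # conjunction of atomic predicates
    consequent: str
    p: float                    # frequency lower bound


def applicable(rules: List[Generalization], facts: FrozenSet[str],
               consequent: str) -> List[Generalization]:
    """Rules about the given consequent whose antecedent is satisfied."""
    return [r for r in rules
            if r.consequent == consequent and r.antecedent <= facts]


def most_specific(rules: List[Generalization]) -> List[Generalization]:
    """Drop every rule whose antecedent is strictly less specific than
    the antecedent of another candidate rule."""
    return [r for r in rules
            if not any(other.antecedent > r.antecedent for other in rules)]


# Hypothetical rules; only the .8 bound appears in the text.
WOLF_GRAY = Generalization(frozenset({"wolf"}), "gray", 0.8)
ARCTIC_WOLF_GRAY = Generalization(frozenset({"wolf", "arctic"}), "gray", 0.1)
YOUNG_GRAY = Generalization(frozenset({"young"}), "gray", 0.5)
RULES = [WOLF_GRAY, ARCTIC_WOLF_GRAY, YOUNG_GRAY]

arctic_case = most_specific(
    applicable(RULES, frozenset({"wolf", "arctic"}), "gray"))
independent_case = most_specific(
    applicable(RULES, frozenset({"wolf", "young"}), "gray"))
```

For an arctic wolf only the more specific rule survives. For a young wolf two logically independent rules remain, which is exactly the case that specificity alone leaves unresolved.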


3  Semantics 

In an attempt to develop more adequate meaning representations (and knowledge representations)
for NLU systems, there is a trade-off between scope and rigor. One can
either cover many of the semantic phenomena of language in a rough-and-ready fashion,
or a few of them with a completely worked out formal framework. (For some sense of
the work left to be done in situation semantics - even without detailed consideration of
linguistic matters - see Barwise’s (1989) essay on “branch points.”) As AI researchers
interested in building working NLU systems, we believe it would be a mistake to cut
back on expressive power for the sake of theoretical tractability. Even the simplest stories
confront us with all the traditional conundrums - tense, aspect, adverbials, generic
sentences, and so on. The best strategy, for our purposes, is a “top-down” one: to use a
representation with more or less full coverage of English from the outset, and to subject
this representation to continual revision in the light of the interlocking needs of grammar,
computation of logical form, formal semantics, and inference.

Inevitably,  then,  many  uncertainties  and  gaps  remain  in  our  logical  semantics. 
However,  what  is  worthwhile  about  our  attempt  is  that  it  gives  detailed  expression  to 
the  intuitions  which  are  guiding  our  work,  subsumes  classical  logic,  provides  tentative 
extensions  in  several  major  directions,  and  is  sufficiently  carefully  formalized  to  make 
future systematic analysis and revision possible. Moreover, the semantic ideas were not
conceived in isolation, but with an eye on the mapping from surface structure to logical
form,  and  on  the  inferences  that  support  story  understanding.  In  these  respects,  the 
attempt  is  at  least  a  step  in  the  right  direction. 

Episodic  logic  is  in  some  ways  a  sequel  to  earlier  efforts  by  Schubert  and  Pelletier 
(1982, 1989) to provide a first-order logical form for English, including generic sentences.
As before, the attempt is Montague-inspired, avoids higher-order predicates (cf. Chierchia
and Turner 1988), and incorporates a DRT-like treatment of indefinites. However,
the new logic relies crucially on the episodic operators and variables introduced in the
last section and grounds intensions in possible situations (or episodes) rather than possible
worlds. Also, it moves away from the amalgamation of contexts and interpretations
in  Schubert  and  Pelletier  (1989)  (more  on  this  later). 

Our  semantics  of  situations  has  naturally  been  influenced  by  situation  semantics 
(Barwise and Perry 1983; Barwise 1989), even though we do not share situation semanticists’
distaste for possible situations (or other possible entities), logical form, and
meaning postulates. From our rather practical point of view, their scruples seem a little
perverse. After all, situation semantics embraces “nonactual situations” supporting
nonactual states of affairs/facts/infons; employs a language of relational tuples, type
abstraction, support (of facts by situations) and constraint relations with striking resemblances
to familiar logical forms; and posits constraints which are similar to meaning
postulates  in  all  but  name.  We  think  that  possible  situations  provide  a  simple  basis  for 
the  semantics  of  propositional  attitudes,  that  logical  form  is  indispensable  in  building 
NLU  systems,  and  that  “meaning  postulates”  is  a  good  term  for  those  general  statements 
which  are  taken  to  be  true  in  virtue  of  the  meaning  of  their  atomic  constituents. 

In  any  case,  the  essential  idea  we  have  adopted  from  situation  semantics  is  that 
sentences  are  used  to  describe  situations  (episodes,  events,  . . .),  which  are  limited  pieces 
of  reality  in  terms  of  the  objects  they  involve,  the  time  they  span,  and  the  sentences 
whose  truth  they  support.  We  implement  this  idea  by  treating  semantic  interpretations, 
and the valuation functions that extend them, as partial functions on situations (cf.
Fenstad et al. 1987). In contrast with situation semantics, episodic logic does not treat
properties as primitive, though it does admit (reified) properties as individuals. Also, we
see  no  point  in  trying  to  treat  “the  facts  of  a  situation”  uniformly  as  relational  tuples. 
For instance, the facts {1≠2, 2≠1} seem to be the same, yet the tuples <=, 2, 1, no>,
<=, 1, 2, no> are distinct. Similarly, {1=1 ∨ 1=2, 1=2 ∨ 1=1, 1 has the property
of being identical with 1 or 2} all seem to be the same fact, yet <∨, <=, 1, 1, yes>,
<=, 1, 2, yes>, yes>, <∨, <=, 1, 2, yes>, <=, 1, 1, yes>, yes>, and <[n | <∨,
<=, n, 1, yes>, <=, n, 2, yes>, yes>], 1, yes> are all distinct. (That is not to say we

have  a  full  account  of  facts,  but  we  think  they  should  be  in  one-to-one  correspondence 
with  true  propositions,  where  propositions  in  turn  are  in  one-to-one  correspondence  with 
sentence  intensions.) 

To  speak  of  sentence  intensions,  and  even  of  sentences  describing  situations  (or 
situation  types),  is  not  necessarily  loose  talk.  Consider  a  context-dependent  sentence 
such  as  “I  am  with  you.”  It  is  usual  in  possible  worlds  semantics  to  say  that  a  specific 
intension  for  such  a  sentence  can  be  reached  only  by  way  of  an  index  (context)  which 
supplies  the  time  and  place  of  speech,  a  speaker,  addressee,  and  other  things.  However, 
it  is  perfectly  possible  to  put  the  burden  of  supplying  all  this  information  squarely  on 
the  interpretation  function,  i.e.,  the  function  that  supplies  basic  intensions.  So  this 
function  would  supply  well-defined  denotations  not  only  for  “Socrates”  and  “Picasso” 
but  also  for  “I,”  “you,”  and  “here”  and  “now,”  and  well-defined  intensions  not  only 
for  “love”  and  “walk,”  but  also  for  “be  with”  (which  can  mean  “be  near,”  “side  with,” 
and  other  things),  and  so  on.  This  is  a  well-known  option,  and  was  used  in  Schubert 
and  Pelletier  (1989),  where  interpretation  functions,  extended  to  valuation  functions, 
were  called  contexts.  (The  same  idea  also  seems  to  be  implicit  in  Barwise’s  (1989) 
“perspectival situations.”)

However,  this  route  is  not  practical  for  computational  purposes.  Suppose  we  were 
to represent an input sentence like the above essentially as received, e.g., as (pres [I be-with
you]). This would be quite useless for inference in conjunction with other formulas
obtained in other circumstances (so that pres, I, etc., would have no fixed meaning).
At the very least, we would need to annotate these indexical formulas with information
about the contexts in which they were obtained, so that the pres and I (etc.) of one
context would not be confused with the pres and I of another context, featuring a
different  speech  time  and  speaker. 

But even such annotated formulas would be impractical for inference, as far as we
can tell. The amount of contextual information that would have to be included in the
annotation would be quite large (including substantial amounts of prior discourse to
provide unambiguous referents for anaphora, description of external objects and
circumstances to provide referents for deictic phrases, etc.), and the computational
burden would be correspondingly heavy. Hence we have chosen to follow the usual practice
in the design of NLU systems of assuming that the initial, indexical representations of
inputs are converted to nonindexical ones, using the contextual information which would
otherwise
have to be permanently attached to their representation. Note that “deindexicalized”
representations of inputs can again be said to describe situations, and to have
intensions, but this time they do this in accord with a fixed interpretation function,
rather than one that changes from moment to moment. Thus deindexicalized representations
will be usable in concert for inference, without regard for their origins. The following
discussion refers to this non-indexical form of episodic logic.

Figure 1: Ontology of basic individuals

Model structures for episodic logic are based on a very liberal ontology of possible
individuals $\mathcal{D}$. Our aim is to include in $\mathcal{D}$ everything we can talk
about. Thus, besides “ordinary” individuals, we include disjoint sets of possible
situations $\mathcal{S}$, propositions $\mathcal{P}$, and properties $\mathcal{Q}$, and
kinds, collections and n-tuples (n = 2, 3, ...) of all of these. (Kind-, collection- and
tuple-formation may be iterated, i.e., there are kinds of kinds, collections of
collections, kinds of collections, kinds of kinds of collections, and so on.) The reason
for allowing “possible individuals” is that ordinary talk abounds with them, as in
“Sherlock Holmes is a fictitious detective” and “Tomorrow’s lecture has been cancelled”
(see Hirst 1989). Note that the latter sentence, if true, refers to a nonexistent (but
possible) event. Figure 1 shows the assumed relations between the basic categories of
individuals.

Total functions begin: $\mathcal{S} \to \mathcal{R}$ and end: $\mathcal{S} \to \mathcal{R}$
determine the beginning and ending “clock times” of all situations, where $\mathcal{R}$
is a set of numbers (possibly the reals or rationals), along with $-\infty$ and
$+\infty$. For all situations s, begin(s) $\le$ end(s). We say s is during s' iff
begin(s') $\le$ begin(s) $\le$ end(s) $\le$ end(s').
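
As a rough illustration of the clock-time functions and the "during" relation, the following Python sketch encodes situations as labelled intervals. The class name `Situation`, the field names, and the example episodes are hypothetical; the comparisons are read as non-strict (an assumption, chosen so that momentary situations and concurrent episodes are possible).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Situation:
    """Toy situation: a label plus begin/end clock times (hypothetical encoding)."""
    label: str
    begin: float
    end: float

    def __post_init__(self):
        # Every situation must satisfy begin(s) <= end(s).
        if self.begin > self.end:
            raise ValueError("begin must not exceed end")


def during(s: Situation, t: Situation) -> bool:
    """s is during t iff begin(t) <= begin(s) <= end(s) <= end(t)."""
    return t.begin <= s.begin <= s.end <= t.end


# A possible world is temporally unbounded in both directions.
world = Situation("w", -math.inf, math.inf)
lecture = Situation("lecture", 9.0, 10.0)
question = Situation("question", 9.5, 9.5)  # a momentary situation

print(during(question, lecture))  # True
print(during(lecture, world))     # True
print(during(world, lecture))     # False
```

Note that "during" is reflexive on this reading, which is what later allows two situations to be during each other.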

Two disjoint relations over $\mathcal{D} \times \mathcal{S}$, namely, Actual and
Nonactual, determine what entities are actual and nonactual relative to a situation.
Together, these are the participants in the situation. Situations are partially ordered
by a part-of (or subepisode) relation “$\sqsubseteq$,” where $s \sqsubseteq s'$ implies
Actual(s, s'). Moreover, for each $s \in \mathcal{S}$ there is at least one maximal
element $w \in \mathcal{S}$ (called a possible world) with begin(w) = $-\infty$,
end(w) = $+\infty$, such that $s \sqsubseteq w$. The corresponding join operator
$\sqcup$ forms a join semilattice over the situations which are Actual relative to any
given world w. $\sqcup$ is understood as joining two episodes into a (temporally and/or
factually) larger episode. (Note that this can lead to episodes with temporally disjoint
subepisodes.) With respect to this operator, clock times obey the rules
begin($s \sqcup s'$) = min{begin(s), begin(s')} and end($s \sqcup s'$) =
max{end(s), end(s')}, whenever $s \sqcup s'$ is defined.⁶ Consequently, subepisodes
occur during (possibly, concurrently with) the episodes of which they are a part, i.e.,
for all $s, s' \in \mathcal{S}$, $s \sqsubseteq s'$ implies s during s'. For example,
the event of John’s eating the strawberries greedily has as a concurrent subepisode his
eating the strawberries (i.e., this subepisode lacks some of the specifics of the
original episode), and it has John’s swallowing the strawberries as a temporally partial
subepisode (since as a matter of meaning, eating something involves swallowing it). With
“during” defined as above, $\sqsubseteq$ can be specialized to a “concurrent part” (or
concurrent subepisode) relation $\preceq$, i.e., $s \preceq s'$ iff $s \sqsubseteq s'$
and s during s' and s' during s.
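
The clock-time rules for the join and the definition of the concurrent-part relation can be sketched as follows. This is only a toy model under a loud assumption: the text leaves the part-of relation primitive, whereas here an episode is an interval plus a set of fact labels, and part-of is approximated as "temporally within and supporting no extra facts". The names `Episode`, `part_of`, `join`, and `concurrent_part` are hypothetical.

```python
from typing import FrozenSet, NamedTuple


class Episode(NamedTuple):
    begin: float
    end: float
    facts: FrozenSet[str]  # toy stand-in for factual content (an assumption)


def during(s: Episode, t: Episode) -> bool:
    return t.begin <= s.begin <= s.end <= t.end


def part_of(s: Episode, t: Episode) -> bool:
    # Toy part-of: s lies within t in time and supports no fact that t lacks.
    return during(s, t) and s.facts <= t.facts


def join(s: Episode, t: Episode) -> Episode:
    # begin(s U t) = min of the begins, end(s U t) = max of the ends.
    return Episode(min(s.begin, t.begin), max(s.end, t.end), s.facts | t.facts)


def concurrent_part(s: Episode, t: Episode) -> bool:
    # s is a concurrent part of t iff s is part of t and each is during the other.
    return part_of(s, t) and during(s, t) and during(t, s)


eat_greedily = Episode(0, 10, frozenset({"eat", "greedy"}))
eat = Episode(0, 10, frozenset({"eat"}))
drink = Episode(20, 25, frozenset({"drink"}))
meal = join(eat, drink)  # an episode with temporally disjoint subepisodes

print(meal.begin, meal.end)                # 0 25
print(concurrent_part(eat, eat_greedily))  # True
print(concurrent_part(eat, meal))          # False: part of the meal, but not concurrent
```

In this encoding the plain eating episode is a concurrent part of the greedy eating episode, mirroring the strawberry example, while it is only a temporally smaller part of the joined meal.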

We can now identify possible times $\mathcal{I}$ as those situations (n.b., situations,
not numeric clock times) maximal in the $\preceq$-ordering; i.e., they have “maximal
content” relative to their concurrent parts. Times, in this sense, are arbitrarily short
(possibly momentary) or long (possibly unbounded) time intervals, such as “this moment,”
“the year 1990,” or “the history of the universe.” Their factual content is “everything
that happened, or was the case, during that time.” Among the possible times, the
temporally unbounded ones are thus the possible worlds $\mathcal{W}$;⁷ the temporally
minimal ones are referred to as the moments of time $\mathcal{T}$. Note that
$\mathcal{W}$ and $\mathcal{T}$ are the traditional indices of possibility. Two
functions that yield moments from times are init and fin, where init(i) is the initial
moment of i, and fin(i) is the final moment of i. These functions are assumed to be
total on $\mathcal{I}$. Individuals can be Actual relative to any number of worlds, and
must be Actual or Nonactual relative to any given world. In general, situations, like
other individuals, can be Actual relative to any number of worlds; however, times are
Actual relative to exactly one world, namely, the one of which they are a part, and
Nonactual relative to all others. Accordingly, we assume a function world(i) supplying
the unique world of a given time i. Intuitively, the reason for this assumption is that
times, being factually (propositionally) maximal relative to all concurrent situations,
already encapsulate the history of their universe; i.e., at any given time i, the fact
that some earlier event occurred is a fact of i, and similarly for future events. So on
our conception of times, the world component of a world-time index is redundant, and
indeed in our semantics there is just one index of possibility, a situational one.

We have also positioned a set of possible facts $\mathcal{F}$ in our ontology, namely,
the temporally unbounded situations (including the possible worlds, which are thus
“maximally inclusive possible facts”). So, the situation that 2 + 2 = 4, being atemporal
(temporally unbounded), counts as a fact. Actual facts are surely related to true
propositions (as we previously suggested, in one-to-one fashion), but as Vendler (1967)
argued, cannot be identified with them. For instance, facts can be discovered and can be
due to other facts, but propositions cannot (only the fact of their truth or falsity
can).⁸

⁶ The binary join can be extended to a set join, where we assume that for a set of
situations $\mathcal{E}$ which are Actual relative to some world w,
begin($\sqcup\mathcal{E}$) = glb{begin(s) | $s \in \mathcal{E}$} $\in \mathcal{R}$, and
end($\sqcup\mathcal{E}$) = lub{end(s) | $s \in \mathcal{E}$} $\in \mathcal{R}$.

⁷ The universe may in a mathematical sense be temporally bounded, but on our intuitive
reckoning, it cannot be. The first question that comes to mind about the Big Bang is
“What went on before that?”

We leave open the exact nature of facts and propositions, but assume a total function
' : $\mathcal{S} \to \mathcal{F}$, giving the facts of a situation s as 's. For a fact
f, 'f = f, and for a time i, 'i = 'world(i). Also, we assume a one-to-one partial
function $^{\wedge} : (\mathcal{S} \to 2) \to \mathcal{P}$ (interpreting That) for
forming propositions from sentence intensions (here $(\mathcal{S} \to 2)$ is the set of
partial functions from $\mathcal{S}$ to 2, 2 being the truth values {0, 1}); its inverse
is $^{\vee} : \mathcal{P} \to (\mathcal{S} \to 2)$ (interpreting True) for forming a
sentence intension from a proposition. Thus, $^{\vee\wedge}\sigma = \sigma$ for sentence
intension $\sigma$, and $^{\wedge\vee}\pi = \pi$ for proposition $\pi$. If ** is the
function interpreting the characterizing relation, so that
**: $(\mathcal{S} \to 2) \to (\mathcal{S} \to (\mathcal{S} \to 2))$, it appears that the
fact corresponding to proposition $\pi$ is $(\iota f\ (^{\vee}\pi\ {**}\ f))$, and the
proposition corresponding to fact f is $(\iota\pi\ (^{\vee}\pi\ {**}\ f))$.

A basic choice that needs to be made in characterizing an interpretation I and its
extension to a valuation function $[\![\,]\!]_I$ is whether types of situations are
“outward persistent,” i.e., whether situations (episodes) are of the same types as their
temporally smaller parts (subepisodes). If an episode consists of John eating his
strawberries and then drinking his coffee, should we regard the entire episode as being
of (tenseless) type “John eat his strawberries” and “John drink his coffee”? I.e., does
the episode as a whole support the truth of the (tenseless) subepisode descriptions?

It is tempting to answer affirmatively, since truth-functional compounds are then more
easily evaluated. For instance, a conjunction will be true in a time (interval) just in
case both conjuncts are true in it (modulo the necessary provisions for DRT-like
handling of indefinites). In a nonpersistent approach, we instead have to say that the
conjunction is true over a time (interval) just in case each conjunct is true over a
subinterval, and the two subintervals together “span” the given interval. As well,
seemingly valid arguments like the following suggest outward persistence: John resigned
today; therefore, he resigned this week. However, both kinds of evidence are
problematic. First, an outward-persistent semantics in which $\Phi$ is true of a time
interval whenever it is true of a subinterval would obliterate information. Suppose, for
instance, that John solved a mathematical problem over a time period i, and in the
course of doing so solved another (smaller) problem over subinterval j. Then the
persistent semantics obliterates the larger problem-solving episode as a distinguishable
instance of “John solved a mathematical problem,” since that description already follows
by persistence from the smaller instance over interval j. This becomes important when
one applies temporal qualification such as “John solved a mathematical problem, and
immediately wrote down the solution,” or “John solved a mathematical problem. It took
him five hours.” These qualifications do not have their intended force if the episode to
which they are applied could be one which was arbitrarily “inflated” (via outward
persistence) to a longer episode.

⁸ Situations support sets of facts (or the truth of sets of propositions), but since
facts can also be facts about situations (and thus indirectly about sets of facts), not
all sets of facts can be the facts of some situation (by cardinality).

More generally, much as we might prefer a simple boolean algebra as a basis for the
truth conditions of all logically compound episode descriptions, truth conditions for
tenseless (and therefore potentially temporal)⁹ sentences simply do not appear to be
boolean. For instance, $[\Phi \wedge \Phi]$ apparently does not entail $\Phi$ (for
temporal sentence $\Phi$). Given that “Over three years, Mary had a baby, and a baby,
and a baby,” the most natural interpretation is one in which the overall episode
described is one containing three separate subepisodes of Mary having a baby. Yet the
overall episode is not one of Mary’s having a baby; rather, it is one of Mary having
three babies. At the same time we should emphasize that on our view, truth values of
atemporal sentences do combine truth-functionally at any given time (or world). Thus
nonindexical predications of the sort “Mary has a baby in episode E17 before Now12, and
Mary has a baby in episode E18 before Now12, etc.,” will combine in the accustomed
boolean fashion.

As  far  as  the  above  linguistic  evidence  for  outward  persistence  is  concerned,  it 
carries  little  weight,  since  today  should  in  any  case  be  interpreted  as  “during  today” 
(which  of  course  entails  “during  this  week”).  For  if  we  take  “John  resigned  today”  as 
describing  a  day-long  episode  (rather  than  one  during  today),  we  will  have  great  trouble 
with  “John  resigned  today  in  a  five-minute  meeting  with  the  boss.” 

Thus, we opt for non-outward persistent temporal situation types, and hence for truth
conditions considerably more complex (but also, we claim, more accurate) than those
usually proposed in situation semantics. Atemporal situation types, however, are outward
persistent. Moreover, we will assume a more restricted sort of persistence for all
situation types, including temporal ones, namely “upward persistence.” By this we mean
persistence through the $\preceq$ relation, i.e., from a situation to a concurrent one
with more “information.” This seems natural to us despite the following sort of
objection. Suppose it is true that “Everyone worked hard this morning” if understood as
referring to today’s events at the office. Shouldn’t the same sentence become false once
we expand our purview to include more of the world? Reasonable as this objection is, we
think concurrent upward persistence nevertheless holds for deindexicalized
representations. In such a representation, a phrase like everyone would have been
replaced by something like everyone at the office, obviating any need for repeated
inference about the sense of everyone that was intended at the time of its use (cf.
Kratzer 1987). The assumption of upward persistence allows us to avoid further
complications in the specification of interpretations and valuations. Namely, it allows
us to use values of expressions at times $i \in \mathcal{I}$ to impose more general
constraints on values at situations $s \in \mathcal{S}$, via upward persistence.

⁹ The fact that only untensed sentences can be temporal needs to be clearly understood:
[Mary marry John], corresponding to tenseless sentence “Mary marry John,” is temporal in
the sense that it applies truthfully to some times (at which marrying occurs) but not to
others. By contrast, [[Mary marry John] * E], as part of the deindexicalized
representation of tensed sentence “Mary married John,” is atemporal (a fact or
falsehood); it is true at all times (and for all time), or at none.
Before stating what constitutes an interpretation I and valuation function
$[\![\,]\!]_I$, we should mention that our semantic rules assume “curried” predicate and
function application, in prefix form. The “flattened” infixed form of predication of
section 2 (with infixed expressions in square brackets) is obtained by some rather
obvious flattening (i.e., bracket deletion) rules and an infixing rule that places the
last argument of a predicate in “subject” position, ahead of the predicate. (However -
following Reichenbach (1947) - we take the sentential argument $\Phi$ in
$[\Phi\ {**}\ \eta]$, $[\Phi * \eta]$, and $[\Phi\ @\ \eta]$ to be first, semantically
as well as syntactically. Also, the pairing function “ | ” and dyadic arithmetic
functions may be infixed, with the function symbol following its first argument.)

Also, we should emphasize that we are not giving a definition of a unique valuation
$[\![\,]\!]_I$ corresponding to interpretation I, but rather a set of constraints on
legitimate valuation functions. In fact, we wish to keep these constraints quite weak,
not insisting, for instance, that $\Phi \vee \Psi$ will be true if one of the disjuncts
is true. This is in contrast with the “standard” definition of disjunction in situation
semantics, which does make that stipulation (Barwise 1989; Fenstad 1987). So we are not
insisting, for example, that a situation in which Mary sleeps must be one in which Mary
sleeps or John sleeps. John may not be involved in the situation even to this extent. As
a result our sentence intensions can be quite “fine-grained” - sufficiently so to avoid
the usual difficulties with propositional attitudes such as belief in possible worlds
semantics. Belief contexts apparently allow some substitutions (such as
$\Phi \vee \Psi$ for $\Psi \vee \Phi$ and $\Phi$ for $\neg\neg\Phi$) but not ones that
“introduce new material.”

As a final preliminary, we need to introduce the notion of the parameters of a formula,
for use in our DRT-like treatment of indefinites (in generic conditionals and
elsewhere). Intuitively, these are just the top-level existentially quantified variables
of the formula (not embedded by quantifiers other than $\exists$ or The). In essence we
will treat $\exists$ as a kind of “weak,” or “preemptable,” quantifier. It has its usual
quantificational force only if the variable it quantifies is a priori valueless;
otherwise, it is “ignored” (though it still serves to mark its variable as preemptable).
The definite quantifier The will be treated similarly as preemptable. The parameters of
formula $\Phi$, written with an underscore as $\underline{\Phi}$, are a set of variables
defined as follows (where Q is a non-preemptable quantifier such as $\forall$, Most,
...):

   $\underline{(\exists\alpha{:}\,\Phi\Psi)} = \{\alpha\} \cup \underline{\Phi} \cup \underline{\Psi}$;   $\underline{(\mathrm{The}\ \alpha{:}\,\Phi\Psi)} = \underline{\Phi} \cup \underline{\Psi}$;

   $\underline{[\Phi \wedge \Psi]} = \underline{[\Phi \vee \Psi]} = \underline{[\Phi\ \mathrm{because}\ \Psi]} = \underline{\Phi} \cup \underline{\Psi}$;

   $\underline{\neg\Phi} = \underline{[\Phi \to \Psi]} = \underline{[\Phi \to_p \Psi]} = \underline{(Q\alpha{:}\,\Phi\Psi)} = \emptyset$;

   $\underline{\Box\Phi} = \underline{(\mathrm{certainly}\ \Phi)} = \ldots = \underline{\Phi}$;

   $\underline{[\Phi * \eta]} = \underline{[\Phi\ {**}\ \eta]} = \underline{[\Phi\ @\ \eta]} = \emptyset$
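
To make the recursive definition of parameters concrete, here is a small Python sketch over a hypothetical tuple-based formula encoding (the operator tags such as `"exists"` and `"char**"` are invented for illustration). It follows one reading of the definition above: existential quantifiers contribute their variable, The passes through the parameters of its restriction and body, conjunction, disjunction and "because" take unions, and negation, conditionals, non-preemptable quantifiers and the episodic operators block all parameters. Treating atomic formulas as parameter-free is an added assumption.

```python
UNION_BINARY = {"and", "or", "because"}
PASS_THROUGH = {"nec", "certainly"}
BLOCKING = {"not", "implies", "implies_p", "forall", "most",
            "char*", "char**", "at"}


def parameters(formula):
    """Return the set of parameters (top-level existential variables)."""
    op = formula[0]
    if op == "atom":                      # assumption: atoms have no parameters
        return frozenset()
    if op == "exists":                    # (exists var: restriction body)
        _, var, restriction, body = formula
        return frozenset({var}) | parameters(restriction) | parameters(body)
    if op == "the":                       # (The var: restriction body)
        _, _var, restriction, body = formula
        return parameters(restriction) | parameters(body)
    if op in UNION_BINARY:
        return parameters(formula[1]) | parameters(formula[2])
    if op in PASS_THROUGH:
        return parameters(formula[1])
    if op in BLOCKING:
        return frozenset()
    raise ValueError(f"unknown operator: {op!r}")


man_x = ("atom", "man", "x")
ill_x = ("atom", "ill", "x")
coughing_x = ("atom", "coughing", "x")

# "Some man x is ill, and x is coughing"
some_man = ("and", ("exists", "x", man_x, ill_x), coughing_x)
# The same existential embedded under a universal quantifier
embedded = ("forall", "y", ("atom", "doctor", "y"), ("exists", "x", man_x, ill_x))

print(parameters(some_man))   # frozenset({'x'})
print(parameters(embedded))   # frozenset()
```

The first example shows why a conjunction can supply a value for a variable occurring free in its second conjunct; the second shows a universal quantifier shielding the embedded indefinite.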


If $d \in \mathcal{D}$, I is an interpretation of the atomic symbols of the logic, and
$\alpha$ is a variable, $I(\alpha{:}d)$ denotes the interpretation identical with I
except that it interprets $\alpha$ as d (regardless of whether or not $\alpha$ already
had a value under I). Also, if $\underline{d}$ is a tuple of n elements of
$\mathcal{D}$, and $\underline{\alpha}$ consists of n variables, then
$I(\underline{\alpha}{:}\underline{d})$ denotes the interpretation obtained from I by
setting the denotations of those variables in $\underline{\alpha}$ which have no prior
values to the corresponding individuals in $\underline{d}$ (e.g., make the assignments
in lexicographic order of the variables). In other words, if some of the variables in
$\underline{\alpha}$ have no values under I and some do, then
$I(\underline{\alpha}{:}\underline{d})$ changes only the interpretations of the
variables without prior values to the corresponding elements of $\underline{d}$, leaving
denotations of variables with prior values unchanged.
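
The two ways of modifying an interpretation can be sketched with plain dictionaries. This is an illustrative encoding only: an interpretation is reduced to a mapping from variable names to individuals, and the function names `override` and `fill_unvalued` as well as the individuals are hypothetical.

```python
def override(interp, var, d):
    """Single-variable case: interpret var as d, replacing any prior value."""
    updated = dict(interp)
    updated[var] = d
    return updated


def fill_unvalued(interp, variables, values):
    """Tuple case: assign values only to variables that lack a prior value."""
    ordered = sorted(variables)  # lexicographic order, as suggested in the text
    if len(ordered) != len(values):
        raise ValueError("need exactly one value per variable")
    updated = dict(interp)
    for var, d in zip(ordered, values):
        if var not in updated:   # a prior value preempts the new one
            updated[var] = d
    return updated


I = {"x": "Pedro"}
print(override(I, "x", "Juan"))                            # {'x': 'Juan'}
print(fill_unvalued(I, {"x", "y"}, ("Juan", "Chiquita")))  # {'x': 'Pedro', 'y': 'Chiquita'}
```

In the second call the prior value of x survives, which is the sense in which an existing value "preempts" a weak quantifier.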

The conditions satisfied by an interpretation I and any of its extensions to a valuation
$[\![\,]\!]_I$ are as follows. Wherever $[\![\,]\!]$ occurs unsubscripted, it is an
abbreviation for $[\![\,]\!]_I$. As before, we use 2 = {0, 1} as truth values and write
$A \to B$ for the set of partial functions from A to B. $A^n \to B$ will abbreviate
$(A \to (A \to \ldots (A \to B) \ldots))$.


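1.  Type of atom $\alpha$                  $I(\alpha)$ is an element of:

    Individual constant:                   $\mathcal{D}$
    Individual variable:                   $\mathcal{D}$, or is undefined¹⁰
    n-place predicate constant:            $\mathcal{D}^n \to (\mathcal{S} \to 2)$
    Predicate modifier:                    $(\mathcal{D} \to (\mathcal{S} \to 2)) \to (\mathcal{D} \to (\mathcal{S} \to 2))$
    Sentence modifier:                     $(\mathcal{S} \to 2) \to (\mathcal{S} \to 2)$
    Predicate nominalization operator:     $(\mathcal{D} \to (\mathcal{S} \to 2)) \to \mathcal{D}$
    Sentence nominalization operator:      $(\mathcal{S} \to 2) \to \mathcal{D}$
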

At this point it would be appropriate to provide characterizations of the specific
nominalization operator That (denoting $^{\wedge}$) and its inverse True (denoting
$^{\vee}$), as well as of the VP nominalization operator Ka and its inverse Has-prop.
Since we take That to apply to atemporal sentences only (after deindexicalization), the
propositions formed by $^{\wedge}$ presumably form a boolean algebra under analogs of
the truth-functional connectives. The deindexicalization of infinitives (and gerunds)
similarly involves introduction of explicit episodic terms, leading to a boolean algebra
of properties. (These are properties of subject-episode pairs, not of subjects alone.)
However, we do not yet understand the structure of propositions and properties well
enough to elaborate these algebras and their relation to predicate and sentence
intensions.


2. (Upward persistence of predicates.) If $\pi$ is an n-place predicate constant, and
s, s' are situations such that $s \preceq s'$, then
$I(\pi)(d_1) \ldots (d_n)(s) = I(\pi)(d_1) \ldots (d_n)(s')$ for all
$d_1, \ldots, d_n \in \mathcal{D}$ such that the LHS is defined.

3. If $\alpha$ is an atomic expression, then $[\![\alpha]\!] = I(\alpha)$.

4. If $\pi, \alpha$ are expressions such that $[\![\pi]\!] \in A \to B$ and
$[\![\alpha]\!] \in A$ for some sets A, B, then
$[\![(\pi\ \alpha)]\!] = [\![\pi]\!][\![\alpha]\!]$ (i.e.,
$[\![\pi]\!]([\![\alpha]\!])$). (More precisely, we ought to type functions
syntactically, in parallel with the set-theoretic types in 1, and express the present
rule in terms of the syntactic types of $\pi$ and $\alpha$, instead of their (in
principle ambiguous) set memberships. But the intention should be clear.)

¹⁰ Note that if a variable has a value, that value “preempts” $\exists$- and
The-quantifiers for that variable.

5. (Upward persistence of formulas.) If $\Phi$ is a formula, and s, s' are situations
such that $s \preceq s'$, then $[\![\Phi]\!]^{s} = [\![\Phi]\!]^{s'}$, if the LHS is
defined.

6. (Persistence of atemporal formulas.) If $\Phi$ is an atemporal formula, and s, s' are
situations such that $s \sqsubseteq s'$, then $[\![\Phi]\!]^{s} = [\![\Phi]\!]^{s'}$ if
the LHS is defined.

Atemporal formulas are those in which temporal predicates (such as walk, girl, kiss)
occur only within the scope of *, **, or @. (Atemporal predicates such as =, cause-of,
and before can occur anywhere in an atemporal formula.)

In the remaining clauses, $\Phi$ and $\Psi$ are formulas, $\alpha$ is a variable, $\eta$
is a term, and i is a time $\in \mathcal{I}$. Also, it will be convenient in the
semantics of $\exists$ and The to let $\mathcal{D}_{I(\alpha)}$ denote $\mathcal{D}$ if
$I(\alpha)$ is undefined, and singleton set $\{I(\alpha)\}$ otherwise. Further, we will
use “restriction domains”
$\mathcal{D}_{\alpha,\Phi,i} = \{d \mid \text{for some time } j \sqsubseteq i,\ [\![\Phi]\!]^{j}_{I(\alpha:d)} = 1\}$
in the semantics of quantifiers (clauses 10-13). Finally, we say that a set of times
$\mathcal{J}$ spans a time i if min{begin(j) | $j \in \mathcal{J}$} = begin(i) and
max{end(j) | $j \in \mathcal{J}$} = end(i). (For infinite $\mathcal{J}$, we use glb and
lub instead of min and max.)
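
The "spans" relation for finite sets of times can be sketched directly. In this illustration times are reduced to (begin, end) pairs of clock times, which ignores their factual content; that reduction and the function name `spans` are assumptions made for the example.

```python
def spans(times, i):
    """True iff the earliest begin and latest end among `times` match those of i."""
    times = list(times)
    if not times:
        return False  # the empty set cannot span any time interval
    earliest = min(begin for begin, _ in times)
    latest = max(end for _, end in times)
    return earliest == i[0] and latest == i[1]


print(spans([(0, 2), (5, 9)], (0, 9)))  # True: gaps in the middle are allowed
print(spans([(0, 2), (3, 7)], (0, 9)))  # False: nothing reaches the end of (0, 9)
print(spans([], (0, 9)))                # False
```

Spanning only constrains the outer endpoints, so a spanning set may leave parts of the interval uncovered.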

7. $[\![\neg\Phi]\!]^{i} = 1$ iff for all times $j \sqsubseteq i$, $[\![\Phi]\!]^{j} = 0$;

   $[\![\neg\Phi]\!]^{i} = 0$ iff for some time $j \sqsubseteq i$, $[\![\Phi]\!]^{j} = 1$.¹¹

E.g., $\neg$[John resign] correctly describes a time interval just in case [John resign]
is false for each subinterval. This is not “outward persistent” for temporal $\Phi$.
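
The negation clause can be tried out on a small discrete model. In this sketch, which rests on several assumptions, times are integer intervals (begin, end), the parts of a time are approximated by its integer subintervals, and a formula is a function from intervals to 1, 0, or None (no truth value). The names `subintervals`, `neg`, and `resign` are hypothetical.

```python
def subintervals(i):
    """All integer subintervals (a, b) with begin(i) <= a <= b <= end(i)."""
    begin, end = i
    return [(a, b) for a in range(begin, end + 1) for b in range(a, end + 1)]


def neg(phi):
    """Three-valued negation following clause 7."""
    def value(i):
        values = [phi(j) for j in subintervals(i)]
        if all(v == 0 for v in values):
            return 1      # false at every subinterval
        if any(v == 1 for v in values):
            return 0      # true at some subinterval
        return None       # truth value gap
    return value


def resign(j):
    # Toy valuation: John's resignation occupies exactly the moment (3, 3).
    return 1 if j == (3, 3) else 0


not_resign = neg(resign)
print(not_resign((0, 2)))  # 1
print(not_resign((0, 5)))  # 0
print(resign((0, 5)))      # 0
```

Here the negation holds of (0, 2) but not of the larger interval (0, 5), which illustrates the remark that negation is not outward persistent. Note that [John resign] itself is also not true of (0, 5), since the temporal formula is not outward persistent either.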

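8. (a) Symmetrical conjunction (using $\underline{\alpha}$ for the parameter set
   $\underline{\Phi \wedge \Psi}$):

   $[\![\Phi \wedge \Psi]\!]^{i} = 1$ iff for some $\underline{d} \in \mathcal{D}^{|\underline{\alpha}|}$ and some times $j, k \sqsubseteq i$,
      $[\![\Phi]\!]^{j}_{I(\underline{\alpha}:\underline{d})} = [\![\Psi]\!]^{k}_{I(\underline{\alpha}:\underline{d})} = 1$ and $\{j, k\}$ spans i;
   $= 0$ iff for all $\underline{d} \in \mathcal{D}^{|\underline{\alpha}|}$ and all times $j, k \sqsubseteq i$,
      either one of $[\![\Phi]\!]^{j}_{I(\underline{\alpha}:\underline{d})}$, $[\![\Psi]\!]^{k}_{I(\underline{\alpha}:\underline{d})}$ is 0,
      or both are 1 but $\{j, k\}$ does not span i.

   (b) Asymmetrical conjunction (using $\underline{\alpha}$ for the parameter set
   $\underline{\Phi\ \&\ \Psi}$, i.e., $\underline{\Phi} \cup \underline{\Psi}$):

   $[\![\Phi\ \&\ \Psi]\!]^{i} = 1$ iff for some $\underline{d} \in \mathcal{D}^{|\underline{\alpha}|}$, […] $= 1$;
   $= 0$ iff for all $\underline{d} \in \mathcal{D}^{|\underline{\alpha}|}$, […] $= 0$, or […] $= 0$.

¹¹ In general, we will not be able to say, even for times i, that
$[\![\Phi]\!]^{i} = 0$ wherever $[\![\Phi]\!]^{i} \neq 1$, because the expressiveness of
the logic inevitably leads to truth value gaps (or inconsistency). For instance, we can
express the Russell sentence
[(Ka $\lambda p\ \neg$[p Has-prop p]) Has-prop (Ka $\lambda p$ [p Has-prop p])] (“The
property of not being self-applicable has the property of being self-applicable”).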
Remarks: Asymmetrical conjunction is useful for abbreviating the semantics of
quantification. Both (a) and (b) allow for forward and backward anaphora, through the
“parameter” mechanism. In effect, they are evaluated as if existential quantifiers at
the highest level in $\Phi$ and $\Psi$ had wide scope over the entire conjunction. (So,
“Some man x is ill, and x is coughing” is evaluated as “For some man x, x is ill and x
is coughing.”)¹²
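
A simplified version of symmetrical conjunction, with the parameter mechanism left out, can be run on a small discrete model. The sketch below assumes integer intervals as times, integer subintervals as their parts, and formulas as functions from intervals to 1, 0, or None; all function names and the toy valuation `birth` are hypothetical.

```python
def subintervals(i):
    begin, end = i
    return [(a, b) for a in range(begin, end + 1) for b in range(a, end + 1)]


def spans(times, i):
    times = list(times)
    return (bool(times)
            and min(b for b, _ in times) == i[0]
            and max(e for _, e in times) == i[1])


def conj(phi, psi):
    """Symmetrical conjunction in the style of clause 8(a), without parameters."""
    def value(i):
        pairs = [(j, k) for j in subintervals(i) for k in subintervals(i)]
        # True: each conjunct holds at some part, and the two parts span i.
        if any(phi(j) == 1 and psi(k) == 1 and spans([j, k], i) for j, k in pairs):
            return 1
        # False: every pair has a false conjunct, or is true but fails to span i.
        if all(phi(j) == 0 or psi(k) == 0
               or (phi(j) == 1 and psi(k) == 1 and not spans([j, k], i))
               for j, k in pairs):
            return 0
        return None
    return value


def birth(j):
    # Toy valuation: Mary has a baby exactly at moments (1, 1) and (3, 3).
    return 1 if j in {(1, 1), (3, 3)} else 0


two_births = conj(birth, birth)
print(two_births((1, 3)))  # 1
print(birth((1, 3)))       # 0
print(two_births((0, 3)))  # 0
```

The interval (1, 3) supports "Mary had a baby, and a baby" without supporting "Mary had a baby," so the conjunction of a temporal formula with itself does not entail that formula in this model. The interval (0, 3) fails because no pair of birth episodes reaches back to its beginning.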

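9. $[\![\Phi \vee \Psi]\!]^{i} = 1$ iff $[\![\Phi]\!]^{i} = 1$ or $[\![\Psi]\!]^{i} = 1$;

   $= 0$ iff $[\![\Phi]\!]^{i} = 0$ and $[\![\Psi]\!]^{i} = 0$.

10. $[\![(\exists\alpha{:}\,\Phi\Psi)]\!]^{i} = 1$ iff for some $d \in \mathcal{D}_{I(\alpha)}$, $[\![\Phi\ \&\ \Psi]\!]^{i}_{I(\alpha:d)} = 1$;¹³

    $= 0$ iff for all $d \in \mathcal{D}_{I(\alpha)}$, $[\![\Phi\ \&\ \Psi]\!]^{i}_{I(\alpha:d)} = 0$.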

Remarks: Our use of asymmetrical conjunction here reflects our observation that
indefinites seem to require truth of the restriction predicate at least at the beginning
of the episode described, but not necessarily at the end. Examples are “A child grew
up,” “A bubble burst,” “An ice cube melted,” etc. Concerning examples like “A planet
formed,” “He became an adult,” or “Mary baked a cake,” we feel that these require an
intensional account, i.e., the verbs are essentially predicate operators. Note the
contrasting valid and invalid inferences from the progressive forms: A child was growing
up, therefore there was a child; A planet was forming, therefore there was a planet;
etc.

For sentences like “A little boy (who used to live here) is now a man,” and “A certain
violinist was once a child prodigy” (cf. Enç 1981, Hinrichs 1988), we conjecture that
there are two additional phenomena involved: (i) A nominal (predicate) operator
sometime, where [$\alpha$ (sometime $\pi$)] means that “$\alpha$ is at some time a
$\pi$,” i.e., it will be true at all times, if it is ever true (so the result is
atemporal). This appears to be involved in the first sentence, and in most sentences
with event-nominals, such as “A supernova was observed at Palomar.” (ii) A scope
phenomenon, in which a quantifier escapes from the scope of the tense operator, thereby
getting its temporal reference from the utterance event; this seems to be involved in
one reading of a certain violinist in the second sentence, viz., the reading where this
has present reference. (In another reading, (sometime violinist) is involved.)

11. $[\![(\mathrm{The}\ \alpha{:}\,\Phi\Psi)]\!]^{i} = 1$ iff for the unique $d \in \mathcal{D}_{I(\alpha)}$ satisfying […] $= 1$,
       $[\![\Phi\ \&\ \Psi]\!]^{i}_{I(\alpha:d)} = 1$;
    $= 0$ iff for the unique $d \in \mathcal{D}_{I(\alpha)}$ satisfying […] $= 1$,
       $[\![\Phi\ \&\ \Psi]\!]^{i}_{I(\alpha:d)} = 0$.

Remarks: This leads to a “referential” reading if $\alpha$ has a prior value, and a
situation-dependent “Russellian” one otherwise, except that we have deliberately left
open the question of whether non-uniqueness should lead to falsity or undefinedness.

¹² Note that according to these truth conditions, names of existentially quantified
variables matter. For example, $[(\exists x{:}\,\Phi\Psi) \wedge (\exists x{:}\,\Phi\neg\Psi)]$
will be logically false, since there is only one parameter, x, which is varied
“simultaneously” in both conjuncts. On the other hand,
$[(\exists x{:}\,\Phi\Psi) \wedge (\exists y{:}\,\Phi\neg\Psi)]$ has the usual truth
conditions (unless x, y have prior values).

¹³ Note that by the definition of $\mathcal{D}_{I(\alpha)}$ this is equivalent to: iff
$I(\alpha)$ is defined, and $[\![\Phi\ \&\ \Psi]\!]^{i} = 1$, or $I(\alpha)$ is
undefined, and for some $d \in \mathcal{D}$, $[\![\Phi\ \&\ \Psi]\!]^{i}_{I(\alpha:d)} = 1$.

12. $[\![(\forall\alpha{:}\,\Phi\Psi)]\!]^{i} = 1$ iff there is a total function $f: \mathcal{D}_{\alpha,\Phi,i} \to \mathcal{I}$ such that
       for all $d \in \mathcal{D}_{\alpha,\Phi,i}$: $f(d) \sqsubseteq i$, $[\![\Phi\ \&\ \Psi]\!]^{f(d)}_{I(\alpha:d)} = 1$, and
       range(f) spans i;
    $= 0$ iff for all total functions $f: \mathcal{D}_{\alpha,\Phi,i} \to \mathcal{I}$ such that
       for all $d \in \mathcal{D}_{\alpha,\Phi,i}$, $f(d) \sqsubseteq i$:
       either for some $d \in \mathcal{D}_{\alpha,\Phi,i}$, $[\![\Phi\ \&\ \Psi]\!]^{f(d)}_{I(\alpha:d)} = 0$,
       or range(f) does not span i.

Remarks: Note that the episodes quantified over may be arbitrarily dispersed in time, as
in “Every premature baby was put in an incubator.” However, the restriction predicate is
assumed to apply at the beginning of each component event, much as in 11. (Again,
sometime and scoping phenomena can lead to apparent violations of this constraint.) The
use of a function in the definition is aimed at the following kind of problem. Suppose
that there are three relevant men, Tom, Dick, and Harry; Tom gives Mary twelve roses,
one at a time; then Dick and Harry each give her a rose; then Tom gives Mary a dozen
more, again, one at a time. Can this entire event be characterized as each man giving
Mary a rose? We think not, though it contains two such events (one with Tom first, the
other with Tom last).

Also, note that as it stands the condition for truth value 1 requires a nonempty domain
of restricted quantification, $\mathcal{D}_{\alpha,\Phi,i}$, since the empty set cannot
span any time interval. This could easily be adjusted to give truth value 1 for empty
$\mathcal{D}_{\alpha,\Phi,i}$.

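13. $[\![(\mathrm{Most}\ \alpha{:}\,\Phi\Psi)]\!]^{i} = 1$ iff there is a partial function $f: \mathcal{D}_{\alpha,\Phi,i} \to \mathcal{I}$ such that
       dom(f) includes more than half¹⁴ of $\mathcal{D}_{\alpha,\Phi,i}$, and
       for all $d \in$ dom(f), $f(d) \sqsubseteq i$ and $[\![\Phi\ \&\ \Psi]\!]^{f(d)}_{I(\alpha:d)} = 1$, and
       range(f) spans i;
    $= 0$ iff for all partial functions $f: \mathcal{D}_{\alpha,\Phi,i} \to \mathcal{I}$ such that
       dom(f) includes at most half of $\mathcal{D}_{\alpha,\Phi,i}$, and
       for all $d \in$ […], $f(d) \sqsubseteq i$:
       either for all $d \in$ dom(f), $[\![\Phi\ \&\ \Psi]\!]^{f(d)}_{I(\alpha:d)} = 0$,
       or range(f) does not span i.

¹⁴ For infinite domains, we hope eventually to use a distributional approach, similar to
that of Bacchus (1988a, b).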
Remarks: Conditions similar to 12 and 13 can be formulated for other monotone increasing
quantifiers, such as many, at least one hundred, etc. These are quantifiers Q such that
for atemporal $\Phi$, $\Psi$ and $\Upsilon$, $(Q\alpha{:}\,\Phi\Psi)$ and
$(\forall\alpha{:}\,\Psi\Upsilon)$ together entail $(Q\alpha{:}\,\Phi\Upsilon)$ (see
Barwise and Cooper 1981). We assume that non-monotone increasing quantifiers can be
recast in terms of monotone increasing ones and negation. Thus,
(Few $\alpha$: $\Phi\Psi$) becomes $\neg$(Many $\alpha$: $\Phi\Psi$), and
(Exactly-n $\alpha$: $\Phi\Psi$) becomes
(At-least-n $\alpha$: $\Phi\Psi$) $\wedge\ \neg$(More-than-n $\alpha$: $\Phi\Psi$), etc.

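14. $[\![\Phi \to \Psi]\!]^{i} = 1$ iff $[\![\Phi]\!]^{j} = 0$ for all times $j \sqsubseteq i$, or $[\![\Phi \wedge \Psi]\!]^{i} = 1$;

$= 0$ iff $[\![\Phi]\!]^{j} = 1$ for some time $j \sqsubseteq i$, and $[\![\Phi \wedge \Psi]\!]^{i} = 0$.

For $s$ any situation, $[\![\Phi \to \Psi]\!]^{s} = 0$ if $[\![\Phi]\!]^{s} = 1$ and $[\![\Phi \wedge \Psi]\!]^{s}$ is undefined.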

Remarks: Through 8(a), this plausibly handles many nongeneric “donkey sentences,”
such as “If Pedro owns a donkey, he will ride it to town tomorrow” (cf.
Schubert and Pelletier 1989). This will not require Pedro to ride all his donkeys to
town.

The third condition ensures that $\Phi \to \Psi$ always has a truth value if its antecedent is
true. This strengthens the conditional sufficiently so that it can be used together
with necessity to express meaning postulates (analogous to constraints in situation
semantics), guaranteeing that if a situation contains the “antecedent information,”
it also contains the “consequent information.”

15. $[\![\Box\Phi]\!]^{i} = 1$ iff for all times $j \in \mathcal{I}$, $[\![\Phi]\!]^{j} = 1$;

$= 0$ iff for some time $j \in \mathcal{I}$, $[\![\Phi]\!]^{j} = 0$.

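16. $[\![\Phi \to_{p} \Psi]\!]^{i} = 1$ iff for “at least a proportion $p$” of elements

$d \in \mathcal{D}$ such that $[\![\Phi]\!]^{i}_{I(x:d)} = 1$,

$[\![\Phi \wedge \Psi]\!]^{i}_{I(x:d)} = 1$;

$= 0$ iff for “more than a proportion $(1-p)$” of elements

$d \in \mathcal{D}$ such that $[\![\Phi]\!]^{i}_{I(x:d)} = 1$,

$[\![\Phi \wedge \Psi]\!]^{i}_{I(x:d)} = 0$.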

Remarks:  This  is  only  a  rough  approximation  to  what  is  required.  First,  the  talk  of 
“proportions”  needs  to  be  replaced  by  a  notion  of  measure,  based  on  a  distribution 
over individuals. Second, instead of using purely “extensional statistics” (proportions
of $\Psi$-instances relative to $\Phi$-instances at $i$), the truth conditions should in
general  be  modalized  to  reflect  the  nomic  character  of  many  generic  sentences; 
this  could  be  done  by  evaluating  the  antecedent  and  consequent  not  only  at  i, 
but  also  at  “nearby”  Nonactual  times  extending  over  the  same  clock-time  range 
as  i.  Third,  the  definition  as  it  stands  does  not  properly  accommodate  backward 
anaphora,  as  in  “If  he  owns  a  donkey,  a  man  often  beats  it.”  There  are  also 
some  subtle  problems  that  cannot  be  discussed  here  concerning  which  parameters 
(indefinites)  should  be  “iterated”  over,  and  which  ones  should  not. 
Nevertheless, the rule is sufficiently accurate to be a guide to action. Given that $\Phi$
holds at a time for certain values of its parameters, it is reasonable on the basis of
much-discussed inductive principles (e.g., Kyburg 1983, Bacchus 1988a, b) to assign
conclusion $\Psi$, for those values, degree of certainty $\ge p$ (at least in the absence of
other information).
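The rough “proportion” reading of clause 16 can be sketched over a finite domain as follows. This is only an illustration of the approximate truth conditions, not of the measure-based or modalized versions called for in the remarks. The function name, the three-valued encoding (1, 0, `None` for undefined), and the choice to return `None` when no element satisfies the antecedent are assumptions of the sketch.

```python
from fractions import Fraction


def generic_conditional(p, domain, antecedent, conjunction):
    """Three-valued evaluation of the probabilistic conditional at a fixed time.

    antecedent(d) and conjunction(d) return 1, 0, or None (undefined),
    standing for the values of the antecedent and of the conjunction of
    antecedent and consequent with the parameter bound to d.
    """
    relevant = [d for d in domain if antecedent(d) == 1]
    if not relevant:
        return None  # assumption: no verdict without antecedent instances
    total = len(relevant)
    true_share = Fraction(sum(conjunction(d) == 1 for d in relevant), total)
    false_share = Fraction(sum(conjunction(d) == 0 for d in relevant), total)
    if true_share >= p:
        return 1
    if false_share > 1 - p:
        return 0
    return None  # a truth-value gap: neither threshold is reached
```

The two thresholds cannot both be met, since the true and false shares sum to at most 1; undefined instances are what open up the gap between them.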


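17. $[\![\Phi * \eta]\!]^{i} = 1$ iff $[\![\eta]\!] \sqsubseteq \text{world}(i)$ and $[\![\Phi]\!]^{[\![\eta]\!]} = 1$;

$= 0$ iff $[\![\eta]\!] \not\sqsubseteq \text{world}(i)$ or $[\![\Phi]\!]^{[\![\eta]\!]} \ne 1$.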

Remark:  Thus  *  denotes  truth  over  an  (actual)  episode. 

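18. $[\![\Phi\ @\ \eta]\!]^{i} = 1$ iff $\Phi$ is stative and $[\![\Phi * \eta]\!]^{i} = 1$, or

$\Phi$ is nonstative and there is an $s \sqsubseteq [\![\eta]\!]$ such that
$[\![\Phi * e]\!]^{i}_{I(e:s)} = 1$, where $e$ is a variable not occurring in $\Phi$;

$= 0$ iff $\Phi$ is stative and $[\![\Phi * \eta]\!]^{i} = 0$, or

$\Phi$ is nonstative and there is no $s \sqsubseteq [\![\eta]\!]$ such that

$[\![\Phi * e]\!]^{i}_{I(e:s)} = 1$, where $e$ is a variable not occurring in $\Phi$.

One can also define the connective syntactically as follows:

$$[\Phi\ @\ \eta] \;=_{\text{def}}\; \begin{cases} [\Phi * \eta] & \text{if } \Phi \text{ is stative;} \\ (\exists e{:}\,[e\ \text{subep}\ \eta]\,[\Phi * e]) & \text{if } \Phi \text{ is nonstative,} \end{cases}$$

where subep corresponds to the $\sqsubseteq$ relation, i.e., $[\![\text{subep}]\!] \in \mathcal{D}^2 \to \mathcal{S} \to 2$ such that for
all $s_1, s_2 \in \mathcal{S}$ and $w \in \mathcal{W}$, $[\![\text{subep}]\!](s_1)(s_2)(w) = 1$ iff $s_2 \sqsubseteq s_1$.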

Remark: The stative/nonstative distinction plays an important role in most theories
of aspect and aspectual class (e.g., Dowty 1979, 86, ter Meulen 88). We make
use of the distinction in the interpretation of perfective aspect (see next section)
and in certain meaning postulates (see footnote 29) and plausible inference rules
(similar to “causal connection” in section 5). Our aspectual classes are quite rudimentary
compared, for instance, to those of Moens and Steedman (1988), who have a very
detailed (but informal) analysis of the interaction between temporal reference and
aspectual classes. One test for stativity is compatibility with durative adverbials
(was ill for two days, waited for an hour versus *resigned for two minutes, *planted
a tree for an hour). Note that on this test negative VPs (such as has not planted
a tree (for an hour)) are stative.
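The stative/nonstative split in clause 18 can be made concrete with a toy model. The sketch assumes that episodes are closed intervals written as `(start, end)` pairs, and that a hypothetical fact table records, for each formula, whether it is stative and over which episodes it was asserted. Treating stative facts as holding over every subinterval of a recorded episode, and nonstative facts as holding only over the recorded episode itself, is a simplification made here and not part of the formal semantics.

```python
def subep(s, e):
    """s is a subepisode of e, for closed intervals (start, end)."""
    return e[0] <= s[0] and s[1] <= e[1]


def star(phi, eta, facts):
    """Toy lookup for [phi * eta] (truth over an episode)."""
    stative, episodes = facts[phi]
    if stative:
        # assumption: a state holds over any part of a recorded episode
        return any(subep(eta, e) for e in episodes)
    return eta in episodes


def at(phi, eta, facts):
    """[phi @ eta] following clause 18."""
    stative, episodes = facts[phi]
    if stative:
        return star(phi, eta, facts)
    # nonstative: some subepisode e of eta with [phi * e]; in this toy model
    # only recorded episodes can satisfy [phi * e]
    return any(subep(e, eta) for e in episodes)


facts = {
    "John ill": (True, [(0, 10)]),
    "John become-well": (False, [(3, 4)]),
}
```

With these facts, `at("John ill", (5, 10), facts)` holds because the state fills the whole interval, whereas `at("John become-well", (0, 10), facts)` holds merely because the event falls somewhere inside it.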

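19. $[\![\Phi ** \eta]\!]^{i} = 1$ iff $[\![\Phi * \eta]\!]^{i} = 1$, and

for all propositions $p \in \mathcal{P}$ such that $v_p([\![\eta]\!]) = 1$, and
for all $s \in \mathcal{S}$, if $[\![\Phi]\!]^{s} = 1$ then $v_p(s) = 1$;

$= 0$ iff $[\![\Phi * \eta]\!]^{i} = 0$, or

for some proposition $p \in \mathcal{P}$ such that $v_p([\![\eta]\!]) = 1$, and
for some $s \in \mathcal{S}$, $[\![\Phi]\!]^{s} = 1$ and $v_p(s) \ne 1$.

Remark: This expresses that if $\eta$ is completely described by $\Phi$, then the only
additional properties it has are entailments of $\Phi$.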
20. For $\pi$ a formula or n-place predicative expression ($n \ge 1$),

$[\![\lambda\alpha\,\pi]\!] = \{\langle d, [\![\pi]\!]_{I(\alpha:d)}\rangle \mid d \in \mathcal{D},\ [\![\pi]\!]_{I(\alpha:d)} \text{ defined}\}$.

21. For $\tau_1$, $\tau_2$ terms,

$[\![\tau_1 = \tau_2]\!]^{s} = 1$ iff $[\![\tau_1]\!] = [\![\tau_2]\!]$, with both defined;

$= 0$ iff $[\![\tau_1]\!] \ne [\![\tau_2]\!]$, with both defined.

This completes our enumeration of truth conditions. Primary omissions are a detailed
algebraic specification of the various nominalization operators and their inverses,
and the semantics of all but a few quantifiers (which can be “filled in” more or less
analogously). Also, we have not specified a semantics for the important sentential connective
because, about which we know little more than we said in section 2. (There are as well
additional operators in our logic relating to questions and wh-nominals which we have
not mentioned, for lack of semantic details.) Finally, we have not given any semantics of
functions over individuals, though clauses 1-4 might easily have been formulated to cover
them. The reason is that we are not convinced that any words or phrases of English
express functions from individuals to individuals. One commonly sees such relations as
father of treated functionally in NL systems, but it seems to us that it is a matter of
contingent world knowledge, not a matter of logic, that fathers are unique (and genetic
advances might falsify this). Even such relations as weight of seem to us only contingently
functional, and more importantly, language appears not to distinguish apparent
functions like weight of from evident relations like neighbor of or advantage of. One
function we do use is the pairing function, whose semantics is simply $[\![\alpha \mid \beta]\!] = \langle [\![\alpha]\!], [\![\beta]\!]\rangle$
(an ordered pair which is defined whenever both elements are defined). Also, we
did mention arithmetic functions at one point, which we find convenient, but they have
little bearing on our concerns in this paper.

Another postscript to our truth conditions concerns the fact that our semantics of
constants in clause 1 makes them rigid designators, denoting specific individuals independently
of situations. This in turn leads to rigid identity conditions, as in the final clause
21. The rigidity assumption keeps truth conditions for predication and quantification
relatively simple, and straightforwardly accounts for de re readings in modal contexts.
On the other hand, it invites trouble with de dicto readings such as are needed to resolve
the Morning Star - Evening Star puzzle.

A  nice  example  from  our  target  story,  Little  Red  Riding  Hood,  is  the  circumstance 
that  the  heroine  (let  us  call  her  LRRH)  thought  the  wolf,  in  Grandmother’s  clothes  and 
lying  in  her  bed,  was  Grandmother.  This  is  not  representable  as 

[[LRRH  believe  (That  [[W  =  G]  *  El])]  **  E2], 

for some episodes E1, E2. LRRH certainly did not think W (the wolf) and G (Grandmother)
were one and the same individual. Rather, we need a de re reading for W and
a de dicto reading for G, i.e., LRRH believed of W that he was G.

One possibility would be to associate an individual concept $c_\alpha \in \mathcal{S} \to \mathcal{D}$ with each
constant $\alpha$, and in effect to “scope” constants using a quantifier-like (Load $\alpha$ $\Phi$) construct
to provide a local value for $\alpha$ via $c_\alpha$. For instance, instead of the above we would
have

[[LRRH believe (That (Load G [[W = G] * E1]))] ** E2],

where for a top-level formula, all constants are assumed to be initially Load-ed with
$c_\alpha(i)$ for the time of evaluation $i$. More deeply embedded “Load $\alpha$”s will override those
with wider scope, for the same $\alpha$. The semantics of Load-ing would be

$[\![(\text{Load}\ \alpha\ \Phi)]\!]^{s} = [\![\Phi]\!]^{s}_{I(\alpha:c_\alpha(s))}$.

Note that apart from changes in clause 1, other truth conditions would not be affected
since constants in such a scheme are still interpreted as individuals, though now they
may be different individuals in different situations.15

Entailment in our logic is defined for a set of premises $\Sigma$ and a conclusion $\Psi$ as
follows. We use the notion of an MP-constrained valuation function $[\![\ ]\!]$, i.e., one which
conforms with the constraints we have stated and also satisfies $[\![\Phi]\!]^{w} = 1$ for all worlds
$w \in \mathcal{W}$ and all formulas $\Phi$ designated as meaning postulates. Then

$\Sigma \models \Psi$ iff for all MP-constrained valuation functions $[\![\ ]\!]$

and all worlds $w \in \mathcal{W}$ such that $[\![\Phi]\!]^{w} = 1$ for all $\Phi \in \Sigma$,

$[\![\Psi]\!]^{w} = 1$.

Note  that  by  defining  entailment  for  worlds,  it  automatically  applies  only  to  atemporal 
formulas  (since  only  atemporal  ones  can  be  true  in  a  world,  i.e.,  for  all  time). 

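We will prove the following entailment, as an illustration of how the truth conditions
for times interact with the persistence conditions (5) and (6):

For $\Phi$ a formula and $\eta$ a term,

$[(\neg\Phi) * \eta] \models (\forall t{:}\,[[t\ \text{time}] \wedge [t\ \text{during}\ \eta]]\ (\neg[\Phi * t]))$,

with $I(\text{time}) \in \mathcal{D} \to \mathcal{S} \to 2$ such that for all $s \in \mathcal{S}$ and $i \in \mathcal{I}$, $I(\text{time})(s)(i) = 1$ if $s \in \mathcal{I}$ and
$s \sqsubseteq i$, and $= 0$ otherwise (so that only Actual times satisfy the predicate); and
$I(\text{during}) \in \mathcal{D}^2 \to \mathcal{S} \to 2$ such that for all $s_1, s_2 \in \mathcal{S}$ and $i \in \mathcal{I}$, $I(\text{during})(s_1)(s_2)(i) =
1$ if $\text{begin}(s_1) \le \text{begin}(s_2)$ and $\text{end}(s_2) \le \text{end}(s_1)$, and $= 0$ otherwise.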

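Proof. For any world $w \in \mathcal{W}$, $[\![(\neg\Phi) * \eta]\!]^{w} = 1$ implies $[\![\eta]\!] \sqsubseteq w$ and $[\![\neg\Phi]\!]^{[\![\eta]\!]} = 1$

(by 17). Hence if $i$ is the time concurrent with $[\![\eta]\!]$ in $w$, $[\![\neg\Phi]\!]^{i} = 1$ (by upward
persistence, 5). Hence, for all times $j \sqsubseteq i$, $[\![\Phi]\!]^{j} = 0$ (by 7). Hence for all times
$j \sqsubseteq i$, $[\![\Phi * t]\!]^{j}_{I(t:j)} = 0$ (by 17). Hence for all times $j \sqsubseteq i$, $[\![\neg[\Phi * t]]\!]^{j}_{I(t:j)} = 1$ (by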

15 A plausible way of introducing Load in the representation of a sentence like the above may be through
a semantic rule that interprets a name N serving as complement of a copular verb as λx(Load N' [x =
N']), where N' is the logical translation of N. This would explain why “LRRH believed that the wolf
was Grandmother” is not synonymous with “LRRH believed that Grandmother was the wolf.” Note
that Load must be assumed to block substitution for free variables in its scope, if the substituted values
involve the “Loaded” constant (at least in intensional contexts).
7). Hence for all times $j \sqsubseteq i$, $[\![\neg[\Phi * t]]\!]^{w}_{I(t:j)} = 1$ (by persistence, 6). Then by the
constraints placed above on $I(\text{time})$ and $I(\text{during})$, and the truth conditions for
“$\forall$” and “$\wedge$,” $[\![(\forall t{:}\,[[t\ \text{time}] \wedge [t\ \text{during}\ \eta]]\ (\neg[\Phi * t]))]\!]^{w} = 1$. $\Box$

4  From  English  to  Episodic  Logic 

It is one thing to posit logical forms for particular sentences, but quite another to generate
them systematically through a grammatical/semantic formalism. An important
advantage  of  our  representation  is  that  it  can  be  directly  and  uniformly  computed  from 
syntactic  analyses  of  input  sentences. 

We conceive of this computation as involving three processing phases (to be interleaved
eventually). First, we use a GPSG-style grammar to compute indexical (context-dependent)
translations with ambiguously scoped quantifiers, connectives, and tense
operators, such as

[<∃ person> <past kiss> Mary]

(“Someone kissed Mary”). Second, we scope the unscoped operators, obtaining formulas
such as

(past (∃x:[x person][x kiss Mary])).

This is still indexical in that it is past relative to some implicit utterance time, and
perhaps also involves implicit relations between the described event and previously described
events. Third, we combine the indexical translation with a context structure
(whose most important component, for present purposes, will be a stack of temporal
structures called tense trees), and then apply equivalence transformations to the combination,
which recursively eliminate the dependence on context, ultimately giving the
desired nonindexical (context-independent) translation.16

We  will  illustrate  the  derivation  of  a  logical  form  for  the  sentence 
(14)  John  inferred  that  Mary  was  tired. 

Here  is  a  GPSG  fragment  adequate  for  (14),  where  each  lexical  or  phrase  structure  rule 
is  paired  with  a  corresponding  semantic  rule: 

A. NP → Mary; Mary

B. NP → John; John

C. A[pred] → tired; tired

D. AP → A; A'

16 In the KR’89 paper, we tried to formulate the semantic rules of the GPSG grammar so that they
immediately introduce episodic variables and “*” and “**” operators, and express tense, aspect and
adverbials in terms of relations between episodes. However, this ran into difficulties with the interaction
between tense, perfective aspect, negation and time adverbials. The unnatural t, e, and h features in
our previous fragment are symptoms of some of these difficulties.
E. V[be, past, 3per, sing, sub] → was; λP<subpast P>17

F. V[be, past, 3per, sing, sub] → was; λP<past P>

G. VP → V[be] AP[pred]; (V' AP')

H. S → NP VP; [NP' VP']

I. S[that, tense18] → COMPL[that] S[sub, tense]; (That S')

J. V[_S[that], past] → inferred; <past infer>

K. VP → V[_S[that]] S[that]; (V' S')

Certain feature principles are assumed here, namely, certain versions of the head feature
principle, the control agreement principle, and the subcategorization principle (cf.
Gazdar et al. (1985) and Pollard and Sag (1987)).19

The  initial  translations  computed  for  (14)  are  then 

(14' a)  [John  <past  infer>  (That  [Mary  <subpast  tired>])],  and 

(14' b)  [John  <past  infer>  (That  [Mary  <past  tired>])]. 

Note  that  we  get  two  translations  since  rules  E  and  F  provide  alternative  semantic 
formulas  for  past-tensed  be  in  embedded  sentences.  We  will  pursue  only  (14' a). 
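How the paired semantic rules yield the two initial translations can be sketched with plain string composition. The encoding below is an assumption made for illustration only: translations are strings, the lambda abstracts of rules E and F become Python functions, and the result of rule K is written flat inside the sentence brackets so that the output matches the infix notation used for (14' a) and (14' b).

```python
# Lexical semantic rules; names follow the rule letters A-F and J.
NP = {"Mary": "Mary", "John": "John"}          # rules A, B
A_PRED = {"tired": "tired"}                    # rule C (rule D passes A' up as AP')
WAS = [
    lambda pred: f"<subpast {pred}>",          # rule E
    lambda pred: f"<past {pred}>",             # rule F
]
INFERRED = "<past infer>"                      # rule J


def vp_be(v, ap):
    """Rule G: (V' AP'), i.e. apply the translation of 'was' to the predicate."""
    return v(ap)


def sentence(np, vp):
    """Rule H: [NP' VP']."""
    return f"[{np} {vp}]"


def that_clause(s):
    """Rule I: (That S')."""
    return f"(That {s})"


def vp_complement(v, s):
    """Rule K: (V' S'), written flat for the infix sentence notation."""
    return f"{v} {s}"


def translations():
    """All unscoped translations of 'John inferred that Mary was tired'."""
    results = []
    for was in WAS:
        embedded = sentence(NP["Mary"], vp_be(was, A_PRED["tired"]))
        matrix_vp = vp_complement(INFERRED, that_clause(embedded))
        results.append(sentence(NP["John"], matrix_vp))
    return results
```

The ambiguity arises at a single point, the lexical entry for was, which is why exactly two translations come out.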

The  second  processing  phase  here  consists  of  “raising”  the  occurrences  of  past  and 
subpast  to  a  permissible  sentential  level.  (We  follow,  e.g.,  Richards  and  Heny  (1982)  in 
treating  tense  as  sentence  operators  with  wide  scope  over  adverbials.  This  is  in  contrast 
with  Hinrichs  (1988)  who  gives  tense  scope  only  over  the  predicate  that  corresponds  to 
the main verb.) This has the following unique result, since That acts as a “scope trap”:

(14")  (past  [John  infer  (That  (subpast  [Mary  tired]))]). 

In  the  third  phase,  we  need  to  combine  this  with  a  context  structure  for  the 
utterance  and  “deindexicalize”  it.  This  involves  application  of  a  fixed  set  of  equivalences 
called  deindexicalization  rules ,  of  which  we  will  show  only  a  few,  as  needed.  A  context 
structure  may  in  general  contain  such  objects  as  a  history  list  (with  recent  referents),  a 
focus  list  (with  recently  focused  entities),  and  other  contextual  information  (see  Allen 
1987).  However,  as  mentioned,  the  only  part  of  the  context  structure  we  will  be  concerned 
with  here  is  the  stack  of  tense  trees,  specifically,  the  current  (most  recent)  tense  tree.  A 
node  of  a  tense  tree  may  have  up  to  three  branches,  where  leftward  branches  correspond 
to  past  tense  (going  to  an  earlier  episode),  straight  downward  branches  to  perfective 
aspect  (going  to  an  earlier  episode  or  one  just  ending),  and  rightward  branches  to  future 

17 “sub” is a head feature that indicates that the constituent appears in a verb complement clause (e.g.,
the object of a propositional attitude); “subpast” means semantic subordinate past. Note that not every
syntactically subordinate past is semantically subordinate past.

18  Rather  than  treating  features  as  attribute-value  functions,  as  in  standard  GPSG,  we  treat  them  as 
trees.  In  this  instance,  tense  is  the  root  of  a  feature  tree  with  daughters  pres  and  past.  A  tree  node  is 
considered  compatible  with  any  of  its  ancestors  or  descendants. 

19The  subcategorization  principle  obviates  the  need  for  explicit  rules  like  K,  but  we  show  the  rule  for 
greater  clarity.  The  departures  from  the  usual  formulations  are  not  important  for  present  purposes. 
modality (going to a later episode). Each node contains a list of (recently mentioned)
episodes. A tense tree always has exactly one node in focus, marked as ⊙; and processing
of a top-level sentence always begins with the root node in focus and a token for the
utterance of the sentence as the rightmost (most recently added) element of its episode
list.

Let  us  assume  that  (14)  was  preceded  by  another  sentence  such  as  “Mary  looked 
pale.”  Then  the  initial  tense  tree  for  (14)  might  be 

        ⊙ u0, u1
       /
      • e0

where u0 and u1 denote the utterances of the preceding sentence and sentence (14),
respectively, and e0 denotes the episode reported in the preceding sentence (Mary’s
looking pale).

Our task is to deindexicalize (14") using a context structure containing this tree
as its current tense tree. We normally write an indexical formula $\Phi$ in combination with
a context structure $C$ as $\Phi_C$, but since we are neglecting all but the current tree $T$ here,
we shall instead write it as $\Phi_T$. For (14") and the assumed $T$, this is

(15) (past [John infer (That (subpast [Mary tired]))])
                                                        ⊙ u0, u1
                                                       /
                                                      • e0

The  first  relevant  deindexicalization  rule  is  for  deindexicalizing  past : 

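Past: $(\text{past}\ \Phi)_T \leftrightarrow (\exists e_T{:}\,[[e_T\ \text{before}\ \mathrm{Ref}_T] \wedge [e_T\ \text{successor-ep}\ \mathrm{Ref}_{\swarrow T}]]$

$[\Phi_{\circ\swarrow T} ** e_T])$.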

Here $\mathrm{Ref}_T$ denotes the most recently stored episode at the focal node of $T$. $e_T$ is the
“next episode variable not yet used in $T$.” This is assumed to be uniquely defined for a
given $T$20; for instance, it might be the letter e followed by the least numeric suffix i such
that ei does not occur anywhere in $T$. “$\swarrow$” (called “shift left”) shifts the focus to the

left daughter, i.e., to the past, creating a new daughter if none yet exists. Note that the
rule explicitly records the fact that $e_T$ appeared in the linguistic context as successor
episode to $\mathrm{Ref}_{\swarrow T}$ (which denotes the most recently stored episode at the focal node
of $\swarrow T$, i.e., in our example, e0).21 This successor-ep predication can later be used to
make probabilistic narrative inferences about the temporal or causal relations between
the two episodes (see section 5). Next, “$\circ$” (called “store”) is a function that operates
on a tense tree $T$, placing a new episodic variable $e_T$ at the focal node; i.e., $\mathrm{Ref}_{\circ T} = e_T$.
Thus, $\circ\swarrow T$, that is, $(\circ(\swarrow T))$, denotes the transformed tense tree obtained from $T$ by
first shifting the focus down to the left, and then storing the new episode $e_T$ at the new

20 Actually, for a given $C$, but we are neglecting the rest of $C$ here.

21 If $\mathrm{Ref}_{\swarrow T}$ is undefined, the successor-ep predication is omitted.
focal node.
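The shift, store, and Past operations just described can be sketched as follows. This is a minimal illustration, not the pilot implementation: the class and method names are hypothetical, the tree is updated in place where the rule speaks of transformed trees, and the resulting formula is returned as a string with the embedded formula left untouched for further deindexicalization.

```python
class Node:
    def __init__(self, episodes=()):
        self.episodes = list(episodes)
        self.children = {}  # "left" (past), "down" (perfect), "right" (future)


class TenseTree:
    def __init__(self, root):
        self.root = root
        self.focus = root

    def ref(self):
        """Most recently stored episode at the focal node, or None."""
        return self.focus.episodes[-1] if self.focus.episodes else None

    def shift(self, branch):
        """Move the focus to a daughter, creating the daughter if needed."""
        self.focus = self.focus.children.setdefault(branch, Node())

    def store(self, episode):
        """Place an episode at the focal node."""
        self.focus.episodes.append(episode)

    def _all_episodes(self, node):
        found = set(node.episodes)
        for child in node.children.values():
            found |= self._all_episodes(child)
        return found

    def new_episode(self):
        """Letter e plus the least numeric suffix not yet used in the tree."""
        used = self._all_episodes(self.root)
        i = 0
        while f"e{i}" in used:
            i += 1
        return f"e{i}"


def deindex_past(phi, tree):
    """Apply the Past rule to (past phi) relative to tree."""
    ref_here = tree.ref()
    e = tree.new_episode()
    tree.shift("left")
    ref_past = tree.ref()  # may be None, in which case successor-ep is omitted
    tree.store(e)
    restriction = f"[{e} before {ref_here}]"
    if ref_past is not None:
        restriction = f"[{restriction} ∧ [{e} successor-ep {ref_past}]]"
    return f"(∃{e}:{restriction} [{phi} ** {e}])"
```

Starting from a root holding u0 and u1 with a left daughter holding e0, the call `deindex_past("[John infer (That (subpast [Mary tired]))]", tree)` introduces e1 and leaves the focus on the left daughter, which now holds e0 and e1.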


Applying  the  Past-rule  to  (15),  we  obtain 

(16) (∃e1:[[e1 before u1] ∧ [e1 successor-ep e0]]

[[John infer (That (subpast [Mary tired]))] ** e1])
                                           • u0, u1
                                          /
                                         ⊙ e0, e1


Note that both the translation and the focus of the tense tree now contain the token
e1 for John’s “inferring” episode. Since the new tense tree is associated with the predication
[John infer (That . . .)], a predication deindexicalization rule (which supplies the
tense tree to the last argument) is applied:

Pred: For $\pi$ an atomic predicate and $\tau_1, \ldots, \tau_n$ terms, where $\tau_i$, $1 \le i \le n$,
is atomic, except possibly for $\tau_n$:22
$[[\tau_1\ \pi\ \tau_2 \ldots \tau_n]_T ** \eta] \leftrightarrow [[\tau_1\ \pi\ \tau_2 \ldots \tau_{n-1}\ {\tau_n}_T] ** \eta]$


The  result  of  applying  Pred  to  (16)  is 

(17) (∃e1:[[e1 before u1] ∧ [e1 successor-ep e0]]

[[John infer (That (subpast [Mary tired]))
                                           • u0, u1
                                          /
                                         ⊙ e0, e1
] ** e1])


Next,  the  deindexicalization  of  the  term  ( That  . . .),  with  the  same  tense  tree,  generates 
a  new  single-node  tree  “embedded”  by  the  given  tree  (note  the  horizontal  “embedding 
link”): 

(18) ... (That (subpast [Mary tired])) ...
                                      • u0, u1
                                     /
                                    • e0, e1 ──── ⊙ e1


This is done by a That-rule, whose details we omit. Note that the root of the new tree
becomes the new focus, and that the previous $\mathrm{Ref}_T$, i.e., e1, has been copied to that
focus. Though we are now strictly dealing with two trees connected by an embedding
link, we shall still refer to them jointly as “the” tense tree.

Next,  one  of  the  rules  for  Subpast  is  applied: 

Subpast(1): For $T$ focused at a past-dominated node, i.e., one with a leftward
branch in its ancestry (where ancestry also passes through embedding
links):

$(\text{subpast}\ \Phi)_T \leftrightarrow (\exists e_T{:}\,[e_T\ \text{ends-during}\ \mathrm{Ref}_T]\ [\Phi_{\circ T} ** e_T])$

Note  that  this  interprets  the  past-embedded  episode  as  overlapping  the  embedding 


22  A  slightly  more  complex  rule,  allowing  successive  arguments  to  change  the  context,  is  needed  for  the 
general  case  of  nonatomic  arguments. 
episode. (Subpast(2) is for $T$ focused at a non-past-dominated node, which interprets
the past-embedded episode as occurring before the embedding episode.) For the formula
(subpast . . .) within (18), the result is

(19) . . . (∃e2:[e2 ends-during e1] [[Mary tired] ** e2]) . . .
                                      • u0, u1
                                     /
                                    • e0, e1 ──── ⊙ e1, e2

Note  that  though  e2  (Mary’s  being  tired)  is  predicated  to  end  during  el  (John’s  inferring 
it),  it  may  well  be  an  initial  segment  of  a  much  longer  episode  of  the  same  type;  i.e., 
Mary  may  continue  to  be  tired.23  It  remains  to  apply  the  Pred-rule  once  more,  and  to 
interpret  Mary  relative  to  the  tense  tree  as  just  Mary  (a  nonindexical  constant),  with 
overall  result: 

(20) (∃e1:[[e1 before u1] ∧ [e1 successor-ep e0]]

[[John infer (That (∃e2:[e2 ends-during e1] [[Mary tired] ** e2]))] ** e1])

Thus we have a fully context-independent representation of (14), which can be used
freely for inference (including narrative inferences based on the successor-ep relation).

There are some eight rules altogether for deindexicalizing tense, future modality,
and perfective and progressive aspect, and additional rules to deal with other constructs
of our logic. We must leave most details to another paper oriented toward these linguistic
issues (Schubert and Hwang, to appear), but will briefly consider perfective aspect,
conjunctions, quantification, and, at slightly greater length, adverbials.

Perfective 

Syntactically, the perfective is handled straightforwardly through the auxiliary VP
rule,

VP → V[aux] ADVL[pre-VP]⁻ ADV[not]⁻ XP; (ADVL' λx(ADV' [x (V' XP')])),24

and lexical rules such as

V[aux, pres, 3per, sing, _VP[-en]] → has; λPλx<pres (perf1 [x P])>,

V[aux, past, pers, numb, _VP[-en]] → had; λPλx<past (perf1 [x P])>,

V[aux, past, pers, numb, _VP[-en]] → had; λPλx<past (perf2 [x P])>.

Note that we distinguish two perfective operators in the semantics, perf1 and perf2. As
will be seen, the interpretation of perf1 is sensitive to the stative/nonstative distinction,

23 On the other hand, if the embedded episode were non-stative, as in “Mary noticed that John winked
at her,” it would indeed be prevented from extending beyond the embedding episode.

24 “Minus” superscripts indicate optional constituents; where such constituents are not present, the
correct semantic rule is obtained by replacing their translations by the identity operator, λP P. The rule
is intended to cover all auxiliaries preceding the main verb, as well as copular be, i.e., V[be, _XP[pred]]
(so that the earlier rule G now becomes redundant). Feature constraints such as that if V[aux] is
the perfective have, then XP must be a VP with feature -en, are assumed to be enforced through
subcategorization features on the V[aux], such as V[aux, _VP[-en]].
while that of perf2 is not. The two operators are introduced to account for certain
differences between present and past perfect (not discussed further here), discernible
in such contrasts as “*John has left yesterday” versus “John had left the day before”;
“*John has been sleeping a minute ago” versus “John had been sleeping a minute before”
(cf. Bennett and Partee 1978). Deindexicalization of perf1 is carried out in accordance
with the following rule:25

Perf(1): $[(\text{perf1}\ \Phi)_T ** \eta] \leftrightarrow [[[\eta\ \text{same-time}\ \mathrm{Ref}'_T] \wedge$

$(\exists e_T{:}\,[e_T\ \text{until}\ \eta]\ [\Phi_{\circ\downarrow T}\ @\ e_T])] ** (F\ \eta)]$,

where $\mathrm{Ref}'_T$ denotes the predecessor of $\mathrm{Ref}_T$ at the focus of $T$, $(F\ \eta)$ means
“the factual content of situation $\eta$,” and $[\Phi\ @\ e]$ is understood as equivalent to $[\Phi * e]$
($\Phi$ holds throughout $e$) for stative $\Phi$ and to $[(\exists e_1{:}\,[e_1\ \text{subep}\ e][\Phi * e_1]) * (F\ e)]$ ($\Phi$ holds
sometime during $e$) for nonstative $\Phi$ (as per clause 18 of the formal semantics in the
previous section).

We think that this rule, together with a similar one for perf2, solves a number
of problems in the interaction of perfectives with tense, adverbials, and the aspectual
class of the complement (without resorting to separate methods for the various forms of
tensed and untensed perfective, perfect progressive, etc., as is often done in computational
linguistics). One we should go into briefly is that of accounting for the contrast
between sentence pairs like

(21) John has been ill;

(22) John has become well.

In (21), the episode described by the have-complement, John’s being ill, extends to
the (speaker’s) present in the preferred reading.26 In (22), however, the corresponding
episode, John’s becoming well, precedes the present. This is accounted for by the above
stipulation that for stative $\Phi$, $[\Phi\ @\ e_T]$ requires that the state described by $\Phi$ occupy the
entire temporal extent of $e_T$, whereas for nonstative $\Phi$, the episode described by $\Phi$ need
only occupy a temporal segment of $e_T$. Thus, since Perf also specifies that $e_T$ lasts until
the (new) reference episode $\eta$, stative “goings-on” extend all the way to the reference
time while nonstative ones need not. Also, Perf, in conjunction with our tense tree
mechanism, supplies properly aligned reference times in examples like “Mary looked at

25 For those acquainted with Reichenbach’s theory of tense, old episode $\mathrm{Ref}'_T$ and new episode $\eta$ both
lie at the “reference time”; $e_T$ temporally contains the “event time,” but extends all the way to the
reference time. No “speech time” is involved in perfective, per se, according to our treatment.

26 In the non-preferred reading, John’s episode of being ill strictly precedes the present. We get such
a reading by allowing a “lexical extension rule” to be applied to any lexical verb marked as stative,
transforming it into a (less readily available) nonstative verb. The corresponding semantic transformation
applies an operator start-do-stop to the logical translation, yielding such nonstative predicates as
(start-do-stop ill). We think there is independent evidence for availability of (nonpreferred) nonstative readings
for all stative verbs, such as the possibility of existential (nongeneric) readings of sentences such as
“Vacationers are on the beach,” “Poor people live in those row houses,” or “Criminals own assault
rifles.”
the picture; John had painted it twelve years ago”; “John thinks that Mary has left”;
and “John inferred that Mary had left.”

Conjunctions 

Concerning  deindexicalization  of  conjunctions,  the  most  straightforward  examples 
are  ones  like  “John  took  the  money  and  left,”  where  a  tensed  verb  heads  each  conjunct. 
After  scoping  of  the  connective  (and  other  operators),  the  following  rule  can  be  applied: 

∧(1): $[\Phi \wedge \Psi]_T \leftrightarrow [\Phi_T \wedge \Psi_{\Phi\cdot T}]$.

Roughly speaking, $\Phi \cdot T$ is the tense tree resulting from adding $\Phi$ to a discourse context
with initial tense tree $T$. In this way, $\Psi$ is processed relative to a context structure which
has already been modified by $\Phi$, as is necessary for making explicit any implicit temporal
relations. Thus, the “$\cdot$” function describes how the tense tree changes. It can be specified
recursively with such rules as

$(\text{past}\ \Phi) \cdot T = \uparrow(\Phi \cdot (\circ\swarrow T))$,

$(\text{perf1}\ \Phi) \cdot T = \uparrow(\Phi \cdot (\circ\downarrow T))$,

$(\text{subpast1}\ \Phi) \cdot T = \Phi \cdot (\circ T)$,

where $\uparrow T$ is the result of moving the focus of $T$ upward one step. However, details of this
need not detain us here. In essence, $(\Phi \cdot T)$ yields the tree one would expect if the shift
and store operations specified in the deindexicalization rules were carried out on a global
data structure (and the focus reset to the node originally focused in $T$).

A  more  subtle  case  of  conjunction  is  the  one  in  which  there  is  just  one  tensed 
verb,  as  in  “John  lost  his  job  and  his  wife.”  Assuming  that  tense  receives  widest  scope, 
there is just one tense operator, namely past, and this leads to a logical form containing
a conjunctive formula characterizing an episode, of the sort [[Φ ∧ Ψ] ** η]. The
deindexicalization rule in this case is roughly

∧(2): [[Φ ∧ Ψ] ** η]_T ↔ [[(∃e_T: [e_T subep η] [Φ_{○T} * e_T]) ∧
                           (∃e_{T'}: [e_{T'} subep η] [Ψ_{○T'} * e_{T'}])] ** (F η)],

where T' = Φ·(○T). We have suppressed some slight complications having to do with
the fact that e_T and e_{T'} need to be stored in the tense tree as episodes “subordinate” to
η, rather than on a par with it. (This can be done with a pair of operators for opening
and closing subepisode lists.) The main point we are making here is that conjunction
introduces subepisodes, enabling other rules (not shown here) to make any implicit
temporal relations between the subepisodes explicit.

Quantifiers 

There is a similar phenomenon for sentences involving quantifiers. For instance, in
“Everyone congratulated Mary,” we can intuitively see that there is not only an “overall”
episode covering all the congratulations, but also a set of individual congratulation
episodes, one for each person in question. In “Everyone congratulated Mary, which made
her happy,” the (preferred) cause of the happiness episode is the “overall” congratulatory
episode. But in “Everyone who met Mary immediately congratulated her,” the
(preferred) reading involves temporal proximity of individual meeting events and
congratulations. The following rule for monotonic increasing quantifiers (based on semantic
rules for quantifiers, 12 and 13) introduces such individual episodes (neglecting certain
refinements having to do with surface order):

Quant: For Q ∈ {∀, Most, Many, A-Few, . . . }:

[(Qa:<I>'Ir)T  **  rj\  «-*•  [(Qa:(3e:[e  during  *  e]) 

(3eT':[eT'  subep  77]  [[$T  k  'Fqt']  *  eT']))  **  (F  77)], 

where T' = Φ·T. Again, e_{T'} should actually be stored as an episode subordinate to η
(using the operators for opening and closing subepisode lists). On the other hand, the
episode e corresponding to (an instance of) the quantifier restriction is not stored in the
tense tree at all, according to this rule. This is intended to account for the intuition that
NPs do not uniformly provide episodic referents in the way VPs do. (For instance, the
referent of it is much less apparent in “Most fathers enjoy it” than in “John is a father and
enjoys it.”) Of course, a noun phrase like everyone who met Mary introduces an episodic
referent through the relative clause, and this referent can be related to the main-clause
episode e_{T'} through tense, aspect, and adverbials.

Adverbials 

Our  grammar  fragment  also  handles  various  adverbs  and  NP/PP-adverbials  in 
combination  with  tense  and  aspect  and  negations.  We  take  adverbials  to  be  uniformly 
VP-adverbials  at  the  level  of  syntax.  (Initial  adverbials  in  sentences  like  “Yesterday 
John  left”  are  treated  as  topicalized.)  The  following  are  some  of  the  main  rules  for 
such  adverbials,  with  some  relevant  lexical  rules.  Note  that  adverbials  are  divided  into 
pre-VP  and  post-VP  adverbials,  and  those  that  can  be  either,  with  feature  mod-VP.  In 
the  translation  of  adverbials,  we  use  operators  like  adv-e  and  adv-f  that  turn  a  1-place 
predicate  over  episodes  into  a  sentence  modifier,  adv-a  that  turns  a  1-place  predicate 
over  actions  into  a  predicate  modifier,  and  adv-m  that  turns  a  1-place  predicate  into 
a  predicate  modifier.27  Meaning  postulates  later  apply  the  predicates  which  are  the 
arguments  of  these  operators  to  episodes  and  actions,  respectively. 

VP  Adjuncts 

VP → ADVL[pre-VP] VP ; (ADVL' VP')

VP → VP ADVL[post-VP] ; (ADVL' VP')

Adverbs 


27 adv-e, adv-f, adv-a, and adv-m stand for episode-modifying adverbial, frequency adverbial,
action-modifying adverbial, and manner adverbial, respectively.

ADVL → ADV[¬not] ; ADV'

ADV[pre-VP] → certainly ; λPλx((-ly certain) [x P])
ADV[mod-VP] → soundly ; (adv-m sound)

ADV[mod-VP] → frequently ; λPλx((adv-f frequent) [x P])

NP/PP Adverbials

ADVL[post-VP] → NP[time] ; λPλx((adv-e (during NP')) [x P])
NP[time] → yesterday ; yesterday

ADVL[post-VP] → PP[ep-mod] ; λPλx((adv-e PP') [x P])

PP → P NP ; (P' NP')

P[ep-mod] → for ; lasts28


As  an  illustration  of  the  effect  of  the  rules  shown  so  far,  here  is  the  initial  translation 
of  a  sentence  with  four  adverbials  (braces  indicate  assumed  phrase  structure): 

(23) {John {certainly {{{slept soundly} for eight hours} yesterday}}}

⇒ ((-ly certain) ((adv-e (during yesterday)) ((adv-e (lasts (K1 (eight hour))))
[John ((adv-m sound) <past sleep>)]))).

This is easily verified. The unscoped past operator now needs to be “raised” to some
sentential level. There are several possibilities, but the preferred scoping is:

(24)  ((-ly  certain) 

(past  ((adv-e  (during  yesterday)) 

((adv-e  (lasts  (K1  (eight  hour)))) 

[John  ((adv-m  sound)  sleep)])))). 

Next,  the  following  are  some  additional  deindexicalization  rules  adequate  for  complete 
deindexicalization  of  (24). 

Deindexicalization  Rules 

1. For α a modal operator and Φ a formula:

(α Φ)_T ↔ (α Φ_T)

2. For π, Π non-modal 1-place predicates and Φ a formula:

((adv-e π) Φ)_T ↔ ((adv-e π_T) Φ_{π·T})

((adv-m π) Π)_T ↔ ((adv-m π_T) Π_{π·T})

We also have rules for deindexicalizing indexical terms, e.g., yesterday_T = (yesterday-rel-to
Now_T), where Now_T is the most recently added episode at the root node of T.

Applying  the  above  rules  (together  with  the  Past-rule)  to  (24),  we  get  the  following 
completely  deindexicalized  formula: 

28 E.g., (lasts (K1 (one hour))), which means “for one hour.”

(25) ((-ly certain)
      (∃e1:[e1 before u1]
        [((adv-e (during (yesterday-rel-to u1)))
          ((adv-e (lasts (K1 (eight hour))))
           [John ((adv-m sound) sleep)])) ** e1])),

where u1 is the utterance of sentence (23).

Finally  here  are  some  meaning  postulates  that  could  be  applied  to  (25).  We 
implicitly  take  these  to  be  necessitated  (i.e.,  prefixed  with  □). 

Meaning  Postulates 


1. For π a 1-place modal predicate and Φ a formula:

((-ly π) Φ) ↔ [(That Φ) π]

2. For τ, η terms, π, Π predicates, Φ a formula, and op ∈ {*, **}:

[((adv-e π) Φ) op η] ↔ [[[η π] ∧ [Φ * η]] op (F η)]29

[[τ ((adv-m π) Π)] op η] ↔ [[[[τ | η] (in-manner π)] ∧ [[τ Π] * η]] op (F η)]30

Applying these MPs to (25), we get the final representation:

(26) [(That (∃e1:[e1 before u1]
        [[[e1 during (yesterday-rel-to u1)] ∧
          [[[e1 lasts (K1 (eight hour))] ∧
            [[[[John | e1] (in-manner sound)] ∧
              [[John sleep] * e1]] * (F e1)]] * (F e1)]] ** (F e1)]))
      certain],

which may be further simplified into31

(27) [(That (∃e1:[e1 before u1]
        [[[e1 during (yesterday-rel-to u1)] ∧
          [[e1 lasts (K1 (eight hour))] ∧
           [[[John | e1] (in-manner sound)] ∧
            [[John sleep] * e1]]]] ** (F e1)]))
      certain].

29 E.g., π = λe[e lasts (K1 (two hour))], λe[e during yesterday], λe[e in Central-Park], . . . ,
or, equivalently, (lasts (K1 (two hour))), (during yesterday), (in Central-Park), ...

Actually, this rule depends on certain “aspectual constraints” being satisfied: state operators must be
applied only to stative formulas. E.g., a durative operator like (adv-e (lasts (K1 (two hour)))) is a state
operator, and in this rule, cannot be applied to a non-stative formula like [John sneeze].

30 E.g., π = forceful (“John forcefully expressed his opinion”),
light (“John kissed Mary lightly on the cheek”), . . .

Note: in-manner is a predicate modifier that turns a predicate over ordinary individuals into a predicate
applicable to actions.

31 The following axiom schemas are used in the simplification process.

For Φ, Ψ atemporal formulas, η a term, and op ∈ {*, **}:

[[Φ ∧ Ψ] op η] → [[[Φ * η] ∧ [Ψ * η]] op η]
[[Φ * η] op η] ↔ [Φ op η]
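The step from (25) to (26) can be traced mechanically. The following is a minimal sketch, not the pilot implementation: it assumes a hypothetical tuple encoding of formulas (`("adv-e", π, Φ)`, `("adv-m", τ, π, Π)`, `("atom", τ, Π)`) and applies only meaning postulate 2 recursively, printing results in the bracket notation used above. The function name `expand` is ours.

```python
def expand(phi, op, eta):
    """Rewrite [phi op eta] with the two adverbial meaning postulates (MP 2).

    phi is a nested tuple in a hypothetical encoding:
      ("adv-e", pi, body)      for ((adv-e pi) body)
      ("adv-m", tau, pi, pred) for [tau ((adv-m pi) pred)]
      ("atom", tau, pred)      for [tau pred]
    """
    kind = phi[0]
    if kind == "adv-e":
        _, pi, body = phi
        inner = expand(body, "*", eta)  # the embedded formula is characterized with *
        return f"[[[{eta} {pi}] ∧ {inner}] {op} (F {eta})]"
    if kind == "adv-m":
        _, tau, pi, pred = phi
        return (f"[[[[{tau} | {eta}] (in-manner {pi})] ∧ "
                f"[[{tau} {pred}] * {eta}]] {op} (F {eta})]")
    _, tau, pred = phi  # no adverbial left: nothing to rewrite
    return f"[[{tau} {pred}] {op} {eta}]"


# Body of (25): the formula characterized by e1.
body_25 = ("adv-e", "during (yesterday-rel-to u1)",
           ("adv-e", "lasts (K1 (eight hour))",
            ("adv-m", "John", "sound", "sleep")))

body_26 = expand(body_25, "**", "e1")
print(body_26)
```

Each adverbial layer contributes one conjunct about e1 and one `(F e1)` wrapper, which is why (26) nests three deep before the simplification to (27).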


5  Making  Inferences  with  Episodic  Logic 

Our  inference  rules  fall  into  three  broad  categories,  namely,  (i)  basic  inference  rules 
including  rule  instantiation  and  its  dual  goal  reduction,  (ii)  narrative  inference  rules 
such  as  causal  connection,  temporal  succession  and  state  persistence,  and  (iii)  simulative 
inference  rules.  All  of  the  rules  allow  for  epistemic  probability  bounds  on  premises  and 
for  use  of  generic  conditionals. 

We  have  already  briefly  described  Rule  Instantiation  (RI)  which  is  heavily 
used  in  input-driven  inference;  its  dual  Goal  Chaining  (GC)  similarly  dominates  goal- 
driven  inference.  We  have  indicated  that  RI  has  modus  ponens  as  special  case,  but  also 
allows  instantiation  of  generic  conditionals.  In  fact,  it  allows  arbitrarily  many  “minor 
premises”  to  be  matched  against  arbitrarily  deeply  embedded  subformulas  of  a  rule. 
(Apart from its avoidance of skolemization, it resembles Andrews’ general matings (1981)
and Bibel’s connections (1979).) Schematically, the rule is

RI (Rule Instantiation)

- Non-probabilistic version -

For R(Φ1, ..., Φm, Φ'1, ..., Φ'n), Ψ1, ..., Ψm, ¬Ψ'1, ..., ¬Ψ'n formulas with
bound variables standardized apart, and with all Φi’s occurring negatively in
R(Φ1, ..., Φm, Φ'1, ..., Φ'n), and all Φ'j’s occurring positively in it:

R(Φ1, ..., Φm, Φ'1, ..., Φ'n)

Ψ1, ..., Ψm, ¬Ψ'1, ..., ¬Ψ'n
------------------------------
R_σ(⊤, ..., ⊤, ⊥, ..., ⊥)

where substitution σ unifies the Φi with corresponding Ψi, and the Φ'j with
corresponding Ψ'j. R_σ(⊤, ..., ⊤, ⊥, ..., ⊥) is then simplified to eliminate the truth
values ⊤ and ⊥.

The substitution σ applies to certain matchable variables which are ∀-quantified
by a positively occurring quantifier, or ∃-quantified by a negatively occurring quantifier,
in R(Φ1, ..., Φm, Φ'1, ..., Φ'n) or one of the Ψi or ¬Ψ'j. A subformula occurs positively if
it lies within an even number of negations, where conditional antecedents and
∀-quantifier restrictions count as negation, and similarly for “negatively occurring.”
Computing R_σ(⊤, ..., ⊤, ⊥, ..., ⊥) involves elimination of quantifiers of variables replaced by
σ, e.g., if b is substituted for x, then (∀x:Φ Ψ)_{b/x} becomes [Φ_{b/x} → Ψ_{b/x}]. A probabilistic
version of RI results when the Ψi or ¬Ψ'j are allowed to have non-unit lower epistemic
probabilities and/or R is a generic conditional.

A  rule  instantiation  typically  instantiates  the  complete  antecedent  of  a  rule,  and 
infers  the  particularized  consequent.  However,  it  may  only  match  part  of  the  antecedent, 
or  match  part  or  all  of  the  consequent  (giving  a  “contrapositive”  inference).  For  example, 
consider  rule 

(∀x [[x whale] → ¬[x fish]]),

i.e., no whale is a fish. Here x is a matchable variable since it is quantified by a
positively occurring ∀-quantifier. [x whale] and [x fish], called, say, Φ1 and Φ2, occur
negatively, while their negations occur positively. Suppose now we have the assertion
[Moby-Dick whale] in the knowledge base. Then, substitution Moby-Dick/x unifies
[Moby-Dick whale] with Φ1, with result

⊤ → ¬[Moby-Dick fish],

which is simplified to ¬[Moby-Dick fish]. Conversely, if we have [Wanda fish] in the
knowledge base, unification with Φ2 yields the inference [Wanda whale] → ¬⊤, i.e.,
¬[Wanda whale].
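The whale example can be replayed with a few lines of code. This is only an illustrative sketch under strong simplifying assumptions: one matchable variable, formulas built from atoms, negation and implication, and only negatively occurring subformulas matched (so the matched atom is replaced by ⊤). The tuple encoding and the names `atom`, `subst`, `replace`, `simplify` and `rule_instantiate` are ours, not the authors’ implementation.

```python
TOP, BOT = "⊤", "⊥"

def atom(subject, predicate):
    """[subject predicate] as a tagged tuple."""
    return ("atom", subject, predicate)

def subst(f, var, value):
    """Apply the substitution value/var throughout formula f."""
    if f in (TOP, BOT):
        return f
    if f[0] == "atom":
        return atom(value if f[1] == var else f[1], f[2])
    return (f[0],) + tuple(subst(g, var, value) for g in f[1:])

def replace(f, target, const):
    """Replace every occurrence of subformula `target` by a truth value."""
    if f == target:
        return const
    if f in (TOP, BOT) or f[0] == "atom":
        return f
    return (f[0],) + tuple(replace(g, target, const) for g in f[1:])

def simplify(f):
    """Eliminate ⊤ and ⊥ from formulas built with 'not' and 'imp'."""
    if f in (TOP, BOT) or f[0] == "atom":
        return f
    if f[0] == "not":
        g = simplify(f[1])
        if g == TOP:
            return BOT
        if g == BOT:
            return TOP
        return g[1] if g[0] == "not" else ("not", g)
    a, b = simplify(f[1]), simplify(f[2])  # implication a -> b
    if a == TOP:
        return b
    if a == BOT or b == TOP:
        return TOP
    if b == BOT:
        return simplify(("not", a))
    return ("imp", a, b)

def rule_instantiate(body, var, phi, fact):
    """Unify the negatively occurring rule atom `phi` with ground `fact`."""
    if phi[1] != var or phi[2] != fact[2]:
        return None  # predicates differ: no unifier
    instance = subst(body, var, fact[1])
    return simplify(replace(instance, fact, TOP))

# (∀x [[x whale] → ¬[x fish]]), with the quantifier left implicit
rule = ("imp", atom("x", "whale"), ("not", atom("x", "fish")))
print(rule_instantiate(rule, "x", atom("x", "whale"), atom("Moby-Dick", "whale")))
print(rule_instantiate(rule, "x", atom("x", "fish"), atom("Wanda", "fish")))
```

The second call is the “contrapositive” use: matching the fact against Φ2 leaves [Wanda whale] → ¬⊤, which simplifies to ¬[Wanda whale].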

As  another  example,  consider  the  rule 

(∀x [[x large-carnivore] → [(∃e [[x hungry] ** e])
        → (∃e' [[e' sametime e] ∧ [[x dangerous] ** e']])]]),

i.e., Every large carnivore is dangerous whenever it is hungry. Here x and e are matchable,
x being quantified by a positively occurring ∀-quantifier, and e by a negatively occurring
∃-quantifier. Also, note that [x large-carnivore] and [[x hungry] ** e], called, say, Φ1
and Φ2, occur negatively in this rule while [e' sametime e] and [[x dangerous] ** e']
occur positively. Suppose now we get input [C11 wolf] and [[C11 hungry] ** E7]. Using
additional facts from the knowledge base, in this case, hierarchical knowledge implicit in
(type) specialists, we can infer [C11 large-carnivore] from [C11 wolf]. Then, substitution
{C11/x, E7/e} unifies [C11 large-carnivore] and [[C11 hungry] ** E7] with Φ1 and Φ2,
with result

⊤ → [⊤ → (∃e' [[e' sametime E7] ∧ [[C11 dangerous] ** e']])],

which is simplified to (∃e' [[e' sametime E7] ∧ [[C11 dangerous] ** e']]). This process
amounts to making the inference “The wolf is dangerous at the time of E7 (the episode
of being hungry).”

The generalization of RI we are using for probabilistic inference is essentially the
same as the nonprobabilistic version, with the antecedent of a generic conditional counting
as a negative environment, and the consequent as positive. Additional rules of
simplification are needed, including: ⊤^q →_p Φ becomes Φ^{pq}, ⊥^q →_p Φ becomes ⊤, Φ →_p ⊤^q
becomes ⊤, and Φ →_p ⊥^q becomes (¬Φ)^{pq}. (These rules are justifiable in terms of
probability theory for q = 1 or 0, but for intermediate q they merely reflect our intuitions
at this point.) As an illustration, if the two conditionals in the “large-carnivore” example
above had weights p and q, the previous conclusion would be obtained with lower
probability pq.
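The arithmetic behind the “large-carnivore” illustration is small enough to write out. The sketch below assumes that detaching a conditional of weight p from an antecedent with lower probability q yields the consequent with lower bound pq, as in the first simplification rule; the names `detach` and `chain` and the sample weights are hypothetical.

```python
def detach(antecedent_lower, rule_weight):
    """Lower bound for the consequent after detaching a weighted conditional."""
    return antecedent_lower * rule_weight

def chain(weights, start=1.0):
    """Detach a nest of conditionals in turn, starting from certain premises."""
    lower = start
    for weight in weights:
        lower = detach(lower, weight)
    return lower

# Outer conditional with weight p, inner conditional with weight q,
# both minor premises certain (lower probability 1).
p, q = 0.5, 0.5
lower_bound = chain([p, q])
print(lower_bound)
```

With certain minor premises the conclusion inherits exactly the product of the rule weights; uncertain premises would simply lower the starting value.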

In practice, RI is implemented roughly as follows. A newly inferred conclusion,
corresponding to one of the Ψi or ¬Ψ'j, is used to index to the rule R. An initial
determination is then made whether the instantiation is likely to succeed and yield a
useful result. If the decision is to instantiate, then the attempt to do so is performed
by a recursive algorithm applied to R, which actively seeks to find appropriate Ψi and
¬Ψ'j instances in the knowledge base to unify with negatively and positively occurring
subformulas of R. Actually, as was indicated in the second illustration, the Ψi and
¬Ψ'j need not even occur explicitly in the knowledge base. They may be inferred by
specialists for type taxonomies, temporal relations, or other special classes of relations,
or by a limited amount of Prolog-like backchaining. Before turning to goal-directed
inference rules, we should mention λ-conversion and substitution of equals for equals as
further deductive rules available in our logic.

For  goal-directed  inference  (e.g.,  in  response  to  questions)  two  general  methods 
are  available.  The  first,  Goal  Chaining  (GC),  is  the  dual  of  RI.  For  comprehensibility, 
we  state  only  a  special  (but  frequently  encountered)  case. 

GC  (Goal  Chaining) 

For R(Φ), Ψ standardized formulas where Φ is a positively occurring subformula
of R(Φ):

R(Φ), Ψ
----------
¬R_σ'(⊥)

where σ' differs from σ (in RI) in that it treats variables of Ψ with positively
occurring ∃-quantifiers or negatively occurring ∀-quantifiers as matchable.

Like RI, this is a very general chaining rule, allowing not only chaining from rule
consequents to antecedents, but from any positively occurring subformula to the rest of R(Φ)
(negated and suitably instantiated).

For example, consider a rule R,

(∃x [x wolf]) → [[x meat-eater] ∧ [x forest-dweller]],

and a goal Ψ,

(∃x1 [x1 forest-dweller]).

In R, subformulas [x meat-eater] and [x forest-dweller] occur positively; in Ψ, x1 is
matchable as it is quantified by a positively occurring ∃-quantifier. Thus, via substitution
x1/x, we can unify [x1 forest-dweller] with [x forest-dweller], and get

¬((∃x [x wolf]) → [[x meat-eater] ∧ ⊥]),

which is simplified to ¬((∃x [x wolf]) → ⊥), i.e., (∃x [x wolf]). This process amounts to
reducing the question “Is there a forest-dweller?” to “Is there a wolf?,” using knowledge
“A wolf is a meat eater and a forest dweller.”

As another example, consider rule R,

(∃x [x wolf]) → [[x gray] ∨ [x black]],

and a goal Ψ,

(∃x1 [x1 gray]).

In R, subformulas [x gray] and [x black] occur positively; in Ψ, x1 is matchable much as
before. Thus, via substitution x1/x, we can unify [x1 gray] with [x gray], and get

¬((∃x [x wolf]) → [[x black] ∨ ⊥]),

which is equivalent to ¬((∃x [x wolf]) → [x black]), i.e., (∃x [[x wolf] ∧ ¬[x black]]). This
process amounts to reducing the question “Is there a gray thing?” to “Is there a wolf
that is not black?,” using knowledge “A wolf is either gray or black.”
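Both goal-chaining examples follow the same recipe: replace the matched, positively occurring subformula by ⊥, negate the rule, and simplify. The sketch below illustrates only that recipe. It assumes the unifying substitution has already been applied and abstracts away the quantifiers, so atoms are plain strings; the encoding and the names `simplify`, `replace` and `goal_chain` are ours.

```python
TOP, BOT = "⊤", "⊥"

def simplify(f):
    """Eliminate ⊤/⊥ and push negation through an implication."""
    if isinstance(f, str):
        return f
    op, *args = f
    args = [simplify(a) for a in args]
    if op == "not":
        (a,) = args
        if a == TOP:
            return BOT
        if a == BOT:
            return TOP
        if isinstance(a, tuple) and a[0] == "not":
            return a[1]                                   # double negation
        if isinstance(a, tuple) and a[0] == "imp":
            return simplify(("and", a[1], ("not", a[2])))  # ¬(A → B) = A ∧ ¬B
        return ("not", a)
    a, b = args
    if op == "and":
        if BOT in (a, b):
            return BOT
        if a == TOP:
            return b
        if b == TOP:
            return a
    elif op == "or":
        if TOP in (a, b):
            return TOP
        if a == BOT:
            return b
        if b == BOT:
            return a
    elif op == "imp":
        if a == TOP:
            return b
        if a == BOT or b == TOP:
            return TOP
        if b == BOT:
            return simplify(("not", a))
    return (op, a, b)

def replace(f, target, const):
    """Replace every occurrence of `target` in f by a truth value."""
    if f == target:
        return const
    if isinstance(f, str):
        return f
    return (f[0],) + tuple(replace(g, target, const) for g in f[1:])

def goal_chain(rule, goal):
    """Reduce `goal` to a subgoal via a rule in which it occurs positively."""
    return simplify(("not", replace(rule, goal, BOT)))

rule1 = ("imp", "wolf(x)", ("and", "meat-eater(x)", "forest-dweller(x)"))
rule2 = ("imp", "wolf(x)", ("or", "gray(x)", "black(x)"))
subgoal1 = goal_chain(rule1, "forest-dweller(x)")
subgoal2 = goal_chain(rule2, "gray(x)")
print(subgoal1)
print(subgoal2)
```

The disjunctive rule leaves a residual condition in the subgoal (a wolf that is not black), which is what distinguishes the second example from the first.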

The general version of GC, like the general RI rule, allows arbitrarily many
subsidiary knowledge base facts to be invoked in the process of chaining from the given goal
to a subgoal. For instance, in the second example above, knowledge base facts [C11 wolf]
and ¬[C11 black] would have led to immediate success (i.e., subgoal ⊤). Also, probabilities
are handled much as in RI. (A subgoal Φ with weight p is interpreted as meaning that if Φ
can be proved with probability q, then the original goal is established with probability pq.)

The second class of goal-directed methods (not yet implemented) consists of
standard natural deduction rules such as proving a conditional by assuming the antecedent
and deriving the consequent; or proving a negative formula by assuming the positive
and deriving a contradiction; or proving a universal by proving an “arbitrary instance”
of it. Such rules are needed for completeness, since goal chaining cannot prove valid
formulas such as Φ → Φ. An interesting future possibility, in the case of proofs involving
assumption-making, is to activate input-driven inferencing (primarily, RI) once an
assumption has been made, so that its important consequences will be worked out, making
it easier to complete the goal-directed proof.

In our implementation, we use a sophisticated agenda-driven control structure for
goal chaining (largely borrowed from ECONET - see de Haan & Schubert 1986) with
goals ranked according to estimated difficulty and new knowledge accessed via concept
and topic hierarchies. The aim here is not so much theorem proving power per se, but
the ability to get at the relevant knowledge in a large knowledge base. Despite the
lack of natural deduction rules, all the examples which ran in the original
resolution-based system are handled by Ecologic (in addition to new examples involving **,
λ-abstraction, and so on).

A remaining problem is, of course, the principled handling of probabilities. The
state of the art in probabilistic inference (e.g., Pearl 1988, Bacchus 1988a) is not such as to
provide concrete technical tools for a logic as general as episodic logic. We are, however,
successfully using a “noncircularity principle” which prevents the same knowledge from
being used twice to “boost” the probability of a particular conclusion. This is done by
keeping track of the support set in a probabilistic inference process. Apart from this, we
use independence assumptions where there are no known dependencies, and manipulate
lower probabilities in accord with the laws of probability.
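One way to picture the noncircularity principle is to attach a support set to every derived conclusion. The sketch below is an assumption-laden illustration, not the authors’ mechanism: the `Belief` record, the product rule for independent premises, and the rule for merging two derivations (keep the stronger one when supports overlap, otherwise combine as independent evidence) are all our own choices.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Belief:
    prob: float          # lower epistemic probability
    support: frozenset   # ids of the knowledge items used to derive it

def infer(rule_id, weight, premises):
    """Conclude from a weighted rule, assuming independent premises."""
    prob = weight
    support = {rule_id}
    for premise in premises:
        prob *= premise.prob
        support |= premise.support
    return Belief(prob, frozenset(support))

def combine(a, b):
    """Merge two derivations of the same conclusion without double counting."""
    if a.support & b.support:
        # Shared knowledge: no boosting, keep the better-supported bound.
        return a if a.prob >= b.prob else b
    # Disjoint supports: treat as independent evidence (an assumption).
    return Belief(1 - (1 - a.prob) * (1 - b.prob), a.support | b.support)

fact = Belief(1.0, frozenset({"S1"}))
via_rule_a = infer("RuleA", 0.5, [fact])   # hypothetical rule ids
via_rule_b = infer("RuleB", 0.5, [fact])
independent = Belief(0.5, frozenset({"S9"}))

overlapping = combine(via_rule_a, via_rule_b)   # both rest on S1
boosted = combine(via_rule_a, independent)      # disjoint support sets
print(overlapping.prob, boosted.prob)
```

Tracking supports as sets makes the overlap test a single intersection, at the cost of carrying the full provenance of every conclusion.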

Finally, we mention some narrative and simulative inference rules (yet to be
implemented).


CC  (Causal  Connection) 

[η' successor-ep η], [η event], [η' event], ¬(∃x [[x | η'] volitional-action])
------------------------------------------------------------------------------
[η cause-of η']^.6

For example, given a fragment “John greeted Mary; Mary was startled,” we conclude
that John’s greeting is the cause of Mary’s being startled, with minimal degree of belief
.6. (Note that since the conclusion is probabilistically qualified, it can be overridden by
other considerations, such as plausibility or other “coherence” relations. For instance,
in “The sun set; the moon rose,” we would want the causal inference to be overridden.)
We should remark that we are in the process of reformulating narrative inference rules
as generic conditionals to be used in the same way as narrative domain knowledge.
For example, the above rule can be reformulated as a generic conditional with η and η'
replaced by existentially quantified variables, and the generic conditional carrying weight
.6. (More accurately, all of the premises and the conclusion should be embedded within
performative predications, such as “text-source asserts that ...”.)

An  example  of  a  simulative  inference  rule  is: 


SIM  (Simulative  Reasoning) 

For τ, an individual term; π ∈ {learn, discover, remember, find, realize . . .};
Φ and Ψ, formulas; η, an episodic term; K, a set of facts which are “common
knowledge” (shared by the reasoner and τ):

[[τ π (That Φ)] ** η]

Φ ⊢_K Ψ
----------------------------------------------
(∃e:[[e right-after η] ∧ [η cause-of e]]
   [[τ infer (That Ψ)] ** e])

where ⊢_K means “follows automatically by input-driven inference, with use
of (only) the additional premises K in the knowledge base.”

Note that in this simulative reasoning rule, the question answerer’s own ability to infer
Ψ from Φ via common knowledge is being attributed to τ, the agent in the antecedent.


6 Episodic Logic and Narrative Understanding


In outline, story understanding on our view involves the following interleaved steps for
each new sentence: (i) parsing and initial logical translation; (ii) disambiguation (including
quantifier scoping and anaphora resolution); (iii) deindexicalization; (iv) application
of all three types of inference rules to the deindexicalized translated input, in combination
with stored knowledge (meaning postulates, generic conditionals, and other general
and specific knowledge); among other things, this may generate new predictions and
explanations; (v) matching of previous predictions and explanations with new ones.

We have put these ideas to the test in two ways. First, we have hand-simulated the
inference process for a small fragment of the story of Little Red Riding Hood. Second, we
have a prototype implementation which accepts logical-form inputs, performs many
of the inferences we have alluded to, and is able to answer simple questions (Schubert et
al. 1989). In this section, we will show how our logic would allow the system to account
for the wolf’s decision not to eat Little Red Riding Hood right away when he first met
her, given a brief excerpt from the story as follows.

In  the  forest,  Little  Red  Riding  Hood  met  a  wolf. 

The  wolf  would  have  very  much  liked  to  eat  her,  but 
he  dared  not  do  so  on  account  of  some  woodcutters  nearby. 

Processing this fragment requires extensive reasoning including inferences based on
meaning postulates, predictive inferences, explanatory inferences and simulative inferences.
For example, to understand the third sentence, one should be able to explain
why  the  wolf  decided  against  eating  Little  Red  Riding  Hood,  and  how  the  presence  of 
woodcutters  nearby  affected  the  wolf’s  decision.  So,  one  has  to  know  that  when  some 
agent  dares  not  do  something,  he  must  think  it  possible  that  his  attempt  to  do  it  would 
result  in  something  unpleasant  to  himself;  then  one  has  to  simulate  his  reasoning  process 
to  guess  what  unpleasant  consequences  he  anticipates. 

Depending  on  the  degree  of  sophistication  of  the  knowledge  possessed,  people  may 
explain  the  wolf’s  decision  in  various  ways.  Correspondingly,  depending  on  the  kind  of 
knowledge  provided,  our  inference  machinery  can  produce  various  lines  of  reasoning;  this 
includes  the  following,  relatively  simple  line  of  reasoning 

•  Attacking  a  child  is  extremely  wicked. 

•  Trying  to  eat  a  living  creature  involves  attacking  it,  and  such  an  attack 
is  conspicuous  and  likely  to  be  noticed  by  nearby  people. 

•  Doing  something  extremely  wicked  is  likely  to  bring  severe  punishment, 
if  noticed  by  anyone. 

• So, if the wolf tries to eat Little Red Riding Hood, the nearby woodcutters
may notice it, and he is likely to be severely punished for it.

Or, the more sophisticated version

• When a predatory animal eats a non-predatory creature of comparable
size while the creature is conscious, the predator attacks it as a preparation
for eating it.

•  The  wolf  would  attack  Little  Red  Riding  Hood  before  eating  her. 

•  Attacking  a  person  is  a  conspicuous  action,  and  is  likely  to  be  noticed 
by  nearby  people. 

•  If  people  notice  a  predatory  animal  attacking  a  person,  they  will  most 
probably  want  to  rescue  the  person  from  the  animal. 

•  To  rescue  a  person  from  a  predatory  animal,  one  may  kill  it. 

•  Thus,  the  woodcutters  may  kill  the  wolf. 

Upon  reaching  the  conclusion  that  it  is  possible  that  the  wolf  might  be  killed  or  severely 
punished,  our  inference  machinery  attributes  its  own  ability  to  infer  that  conclusion  to 
the  wolf  (this  is  due  to  our  rules  of  simulative  reasoning).  Then  it  is  easily  explained 
why  the  wolf  decided  against  eating  Little  Red  Riding  Hood  right  then  and  there. 

In  the  following  we  show  in  detail  that  part  of  the  reasoning  process  reaching  the 
conclusion  “The  wolf  may  be  severely  punished.”  The  inferences  we  show  have  actually 
been  generated  by  the  current  implementation  (in  nearly  the  same  order).  The  control 
structure  is  designed  to  systematically  combine  each  new  clause  with  relevant  meaning 
postulates  and  other  general  knowledge.  All  of  the  inferences  are  based  on  the  explicit, 
formalized  rules  of  inference  we  introduced  earlier.  (Simulative  and  narrative  inferences 
are  not  yet  made,  but  are  not  needed  here.)  After  listing  meaning  postulates  and  world 
knowledge,  we  show  the  logical  translation  of  the  story  and  the  reasoning  process. 

Meaning  Postulates 

M1. To walk, to attack someone, to try to do something, to die, etc., are types of
actions.

For Π an action predicate:

□ [(Ka Π) action-type]

An “action predicate” is an expression (π τ1 ... τn−1), where π is an n-adic atomic
action predicate, n ≥ 1, and τ1, ..., τn−1 are terms.

M2 (A meaning postulate regarding actions).

For Π an action predicate:

□ (∀x (∀e [[[x Π] ** e] ↔ [[x | e] instance-of (Ka Π)]]))

For example,

[[John eat] ** E1] ↔ [[John | E1] instance-of (Ka eat)].

Note that [John | E1] is an action, not just an arbitrary individual-episode
pair, so that John is the agent of that action.

M3. If there is a collection of things of some type, then there is a thing of that type
which belongs to that collection (we regard collections as non-empty by definition).

For Π a monadic predicate:

□ (∀x:[x (coll Π)] (∃y:[y in x] [y Π]))

“coll” is a function that maps a predicate applicable to things into a predicate
applicable to collections of things.


World  Knowledge 

K1. For a creature to attack a child is extremely wicked.

(∃x:[x creature] (∃y:[y child] (∃e [[x attack y] ** e])))
   →.9 [[x | e] ((-ly extreme) wicked)]

K2. Trying to eat any living creature involves attacking it.

(∀x:[[x alive] ∧ [x creature]]
   [(Ka (try (Ka (eat x)))) involve (Ka (attack x))])

K3. If one type of action involves another, then any creature doing an instance of the
first will do an instance of the second during it.

(∀a1:[a1 action-type]
   (∀a2:[[a2 action-type] ∧ [a1 involve a2]]
      (∀x:[x creature]
         (∀e1:[[x | e1] instance-of a1]
            (∃e2:[e2 during e1]
               [[x | e2] instance-of a2])))))
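To see how K2 and K3 combine with M2, consider a single forward step from “W tries to eat LRRH during E1.” The sketch below is a hypothetical illustration: it assumes K2 has already been instantiated for LRRH (which presupposes that she is alive and a creature), that M2 has turned the story fact into an instance-of fact, and that fresh episode constants such as E100 stand in for the existentially quantified e2. The function name `apply_k3` and the fact encoding are ours.

```python
import itertools

def apply_k3(instances, involve, creatures):
    """One forward application of K3.

    instances: (agent, episode, action_type) triples, i.e. [[x | e1] instance-of a1]
    involve:   (a1, a2) pairs, i.e. [a1 involve a2]
    creatures: individuals known to satisfy [x creature]
    """
    fresh = (f"E{n}" for n in itertools.count(100))  # hypothetical new episode names
    derived = []
    for agent, episode, action in instances:
        for first, second in involve:
            if action == first and agent in creatures:
                sub_episode = next(fresh)
                derived.append(("during", sub_episode, episode))
                derived.append(("instance-of", agent, sub_episode, second))
    return derived

try_eat = "(Ka (try (Ka (eat LRRH))))"
attack = "(Ka (attack LRRH))"

facts = apply_k3(
    instances=[("W", "E1", try_eat)],   # from the story fact via M2
    involve=[(try_eat, attack)],        # K2 instantiated for LRRH
    creatures={"W"},
)
print(facts)
```

Applying M2 in the other direction to the derived instance-of fact would give [[W attack LRRH] ** E100], the premise that K1 and K4 need.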

K4.  For  a  sizable  creature  to  attack  a  sizable  thing  is  conspicuous  (relative  to  a  human 
observer). 

(3x:[x  person] 

(3y:[[y  creature]  A->[y  tiny-rel-to  x]] 

(3x:[[x  creature]  A->[z  tiny-rel-to  y]] 

(3e[[y  attack  z\  **  e])))) 

—♦,9  [[y  |  e]  conspicuous-to  x] 

By  contrast,  for  an  ant  to  attack  something  would  not  be  conspicuous  to  a  human. 

K5.  If  a  creature  performs  a  conspicuous  action  within  plain  sight  of  a  person,  that 
person  is  likely  to  notice  that  action. 
(∃x:[x creature]
  (∃y:[y person]
    (∃e1:[[x within-plain-sight-of y] ** e1]
      (∃e2:[e2 during e1] [[x | e2] conspicuous-to y]))))
  →.6 (∃e3:[e3 during e2] [[y notice [x | e2]] ** e3])

K6.  Doing  something  extremely  wicked  may  bring  severe  punishment  from  some  group 
of  people,  if  noticed  by  anyone. 

(∃x:[x creature]
  (∃e1:[[x | e1] ((-ly extreme) wicked)]
    (∃y:[y person]
      (∃e2 [[y notice [x | e1]] ** e2]))))
  →.3 (∃z:[z (coll person)]
        (∃e3:[e2 cause-of e3]
          [[z ((-ly severe) (punish x))] ** e3]))

K7.  A  human  is  not  tiny  relative  to  a  wolf,  and  vice  versa. 

(∀x:[x human] (∀y:[y wolf]
  [¬[x tiny-rel-to y] ∧ ¬[y tiny-rel-to x]]))

K8.  If  a  creature  is  near  a  person  and  not  tiny  relative  to  the  person,  it  is  probably 
within  plain  sight  of  the  person.  (This  could  be  improved  by  assuming  that  we  are 
dealing  with  a  daytime  episode  in  an  open  setting.) 

(∃x:[x person]
  (∃y:[[y creature] ∧ ¬[y tiny-rel-to x]]
    (∃e1 [[y near x] ** e1])))
  →.6 (∃e2:[e2 same-time e1]
        [[y within-plain-sight-of x] ** e2])

K9.  Woodcutters  are  humans. 

(∀x:[x woodcutter] [x human])


Story 


Now, let’s work out the possible consequences if the wolf tries to eat Little Red Riding
Hood. (We then attribute this reasoning to the wolf.) The relevant assumptions and
story  facts  are  as  follows  (where  we  use  the  convention  of  having  variables  in  lower  case, 
and  constants  in  upper  case): 
# The wolf tries to eat Little Red Riding Hood.

(∃e1:[now³² during e1] (The x1:[x1 wolf] [[x1 try (Ka (eat LRRH))] ** e1]))

By skolemization {E1/e1} and reference determination {W/x1}:

S1. [now during E1]

S2. [W wolf]

S3. [[W try (Ka (eat LRRH))] ** E1]


#  Little  Red  Riding  Hood  is  a  girl  and  alive. 

S4. [LRRH girl]

S5. [LRRH alive]


#  There  are  woodcutters  nearby. 

(∃y1:[y1 (coll woodcutter)]
  (∀x:[x in y1] (∃e2:[E1 during e2] [[W near x] ** e2])))

By Skolemizing {C1/y1}:

S6. [C1 (coll woodcutter)]

S7. (∀x:[x in C1] (∃e2:[E1 during e2] [[W near x] ** e2]))

#  Assume  the  following  type-hierarchical  knowledge  is  available  (at  least  indirectly, 
via  a  type  “specialist”): 

S8. [W creature]

S9. [LRRH child]

S10. [LRRH human]

S11. [LRRH creature]
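
The derivation of S8 to S11 by a type “specialist” can be pictured with a small lookup over a type hierarchy. The following is only a sketch under stated assumptions: the links in `SUPERTYPES` are hypothetical (the text gives only their consequences, S8 to S11 and the later step from “human” to “person”), and the names `types_of` and `type_facts` are introduced here for illustration.

```python
# Hypothetical supertype links; only their consequences (S8-S11, step 13)
# appear in the text, the links themselves are assumed for illustration.
SUPERTYPES = {
    "wolf": {"creature"},
    "girl": {"child", "human"},
    "human": {"person", "creature"},
}


def types_of(base_type):
    """Return base_type together with every type reachable through SUPERTYPES."""
    seen = set()
    stack = [base_type]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(SUPERTYPES.get(t, ()))
    return seen


def type_facts(constant, base_type):
    """List the derived facts [constant type], excluding the asserted base type."""
    return sorted((constant, t) for t in types_of(base_type) - {base_type})


print(type_facts("W", "wolf"))     # facts derivable from S2 [W wolf]
print(type_facts("LRRH", "girl"))  # facts derivable from S4 [LRRH girl]
```

Under these assumed links, S8 follows from S2, and S9 to S11 follow from S4.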

Reasoning  Process 

- Note that simple time inferences such as

[E1 during E2] ∧ [E2 during E3] ∧ [E3 same-time E4] ⊢ [E1 during E4]

will be taken for granted during the inference process.
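
One way such time inferences might be mechanized is a simple closure computation over “during” and “same-time” facts. This is a minimal sketch, assuming that “during” is transitive and that “same-time” is symmetric and licenses substitution on either side of “during”; the function name `close_during` and the pair encoding are illustrative choices, not part of the system described here.

```python
from itertools import product


def close_during(during, same_time):
    """Close a set of (a, b) pairs meaning [a during b] under
    transitivity and substitution of same-time episodes."""
    during = set(during)
    # same-time is treated as symmetric (an assumption of this sketch)
    same = set(same_time) | {(b, a) for a, b in same_time}
    changed = True
    while changed:
        new = set()
        # [a during b] and [b during d] yield [a during d]
        for (a, b), (c, d) in product(during, during):
            if b == c:
                new.add((a, d))
        for (a, b), (c, d) in product(during, same):
            if b == c:   # [a during b] and [b same-time d] yield [a during d]
                new.add((a, d))
            if a == c:   # [a during b] and [a same-time d] yield [d during b]
                new.add((d, b))
        changed = not new <= during
        during |= new
    return during


facts = close_during({("E1", "E2"), ("E2", "E3")}, {("E3", "E4")})
print(("E1", "E4") in facts)  # True
```

The same closure covers the time premise used later for K5, where [E2 during E4] is needed from [E2 during E1], [E1 during E3] and [E4 same-time E3].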

32  now  will  be  replaced  by  a  term  with  fixed  reference  to  the  time  of  speech. 




- In the following,

RI [A; B] {Subst C/v; Imm-Skol C’/v’}

indicates that the subsequent inference(s) have been made via Rule Instantiation
of rule B by premise(s) A, with variable substitution C/v, and that an existential variable
v’ in the inferred formula has been immediately skolemized as C’.
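
The RI notation can be illustrated with a small sketch of substitution followed by immediate skolemization. The encoding below is an assumption made for illustration only: formulas are nested tuples, a rule is a dictionary of premises and conclusions, probabilities are ignored, and `substitute` and `rule_instantiate` are hypothetical names. The example reproduces the instantiation of K3 that yields conclusions 5 and 6 below.

```python
def substitute(term, binding):
    """Replace every variable named in `binding` throughout a nested-tuple formula."""
    if isinstance(term, tuple):
        return tuple(substitute(part, binding) for part in term)
    return binding.get(term, term)


def rule_instantiate(rule, facts, subst, skolem):
    """RI [premises; rule] {Subst ...; Imm-Skol ...} for a non-probabilistic rule."""
    binding = dict(subst)
    premises = [substitute(p, binding) for p in rule["premises"]]
    missing = [p for p in premises if p not in facts]
    if missing:
        raise ValueError(f"unsupported premises: {missing}")
    binding.update(skolem)  # existential variables become fresh constants
    return [substitute(c, binding) for c in rule["conclusions"]]


TRY_EAT = ("Ka", ("try", ("Ka", ("eat", "LRRH"))))
ATTACK = ("Ka", ("attack", "LRRH"))

K3 = {
    "premises": [
        ("a1", "action-type"),
        ("a2", "action-type"),
        ("a1", "involve", "a2"),
        ("x", "creature"),
        (("|", "x", "e1"), "instance-of", "a1"),
    ],
    "conclusions": [
        ("e2", "during", "e1"),
        (("|", "x", "e2"), "instance-of", "a2"),
    ],
}

FACTS = {
    (TRY_EAT, "involve", ATTACK),                 # 1
    (TRY_EAT, "action-type"),                     # 2
    (ATTACK, "action-type"),                      # 3
    ("W", "creature"),                            # S8
    (("|", "W", "E1"), "instance-of", TRY_EAT),   # 4
}

conclusions = rule_instantiate(
    K3, FACTS,
    subst={"a1": TRY_EAT, "a2": ATTACK, "x": "W", "e1": "E1"},
    skolem={"e2": "E2"},
)
for formula in conclusions:
    print(formula)
```

The two printed formulas correspond to [E2 during E1] and [[W | E2] instance-of (Ka (attack LRRH))].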

RI [S5, S11; K2] {Subst LRRH/x}:

1. [(Ka (try (Ka (eat LRRH)))) involve (Ka (attack LRRH))]

“Trying to eat LRRH involves attacking her.”

RI [; M1] {Subst (try (Ka (eat LRRH)))/Π}:

2. [(Ka (try (Ka (eat LRRH)))) action-type]

“Trying to eat LRRH is an action type.”

RI [; M1] {Subst (attack LRRH)/Π}:

3. [(Ka (attack LRRH)) action-type]

“Attacking LRRH is an action type.”

RI [S3; M2] {Subst W/x, E1/e, (try (Ka (eat LRRH)))/Π}:

4. [[W | E1] instance-of (Ka (try (Ka (eat LRRH))))]

“The wolf’s trying to eat LRRH is an instance of someone’s trying to eat LRRH.”

RI [2, 3, 1, S8, 4; K3]
{Subst (Ka (try (Ka (eat LRRH))))/a1, (Ka (attack LRRH))/a2, W/x, E1/e1;
Imm-Skol E2/e2}:

5. [E2 during E1]

6. [[W | E2] instance-of (Ka (attack LRRH))]

RI [6; M2] {Subst W/x, E2/e, (attack LRRH)/Π}:

7. [[W attack LRRH] ** E2]

“The wolf attacks LRRH.”³³


33 Inferences 5 and 7 could be obtained in one step rather than six by using the following knowledge
K2' instead of K2.

K2'. When a creature tries to eat a creature that is alive, he attacks it during that episode.

(∃x:[x creature]
  (∃y:[[y alive] ∧ [y creature]]
    (∃e1 [[x try (Ka (eat y))] ** e1])))
  → (∃e2:[e2 during e1] [[x attack y] ** e2])

Specifically,

RI [S8, S5, S11, S3; K2'] {Subst W/x, LRRH/y, E1/e1; Imm-Skol E2/e2}:

5. [E2 during E1]

7. [[W attack LRRH] ** E2]
RI [S8, S9, 7; K1] {Subst W/x, LRRH/y, E2/e}:

8. [[W | E2] ((-ly extreme) wicked)]^.9

#  Up  to  here: 

The  wolf  attacks  LRRH,  and  that’s  extremely  wicked. 

RI [S10, S2; K7] {Subst LRRH/x, W/y}:

9. ¬[LRRH tiny-rel-to W]

“LRRH is  not  tiny  relative  to  the  wolf.” 

RI [S6; M3] {Subst C1/x, woodcutter/Π; Imm-Skol C2/y}:

10. [C2 in C1]

11. [C2 woodcutter]

“C2 is a woodcutter.”

RI  [11;  K9]  {Subst  C2/x}: 

12.  [C2  human] 

“The  woodcutter  is  a  human.” 

With  type-hierarchical  knowledge,  we  get  from  12: 

13.  [C2  person] 

RI  [12,  S2;  K7]  {Subst  C2/x,  W/y}: 

14. ¬[W tiny-rel-to C2]

“The  wolf  is  not  tiny  relative  to  the  woodcutter.” 

RI [13, S8, 14, S11, 9, 7; K4] {Subst C2/x, W/y, LRRH/z, E2/e}:

15. [[W | E2] conspicuous-to C2]^.9

#  Up  to  here: 

The  wolf’s  attack  is  conspicuous  to  the  woodcutter. 

RI [10; S7] {Subst C2/x; Imm-Skol E3/e2}:

16. [E1 during E3]

17. [[W near C2] ** E3]

“The wolf is near the woodcutter (when he tries to eat LRRH).”

33 (continued) However, our aim is to obtain the desired inferences in narrative understanding from any reasonable,
intuitively natural way of formulating the relevant world knowledge. K2 is probably more natural than
K2', and more importantly, was written down prior to detailed consideration of the reasoning process it
was intended to support. If we are going to have a robust system whose knowledge base and range of
understanding are easily expanded, we cannot afford to “tailor” the syntactic form of the axioms to the
inference chains we choose as examples.

RI [13, S8, 14, 17; K8] {Subst C2/x, W/y, E3/e1; Imm-Skol E4/e2}:

18. [E4 same-time E3]

19. [[W within-plain-sight-of C2] ** E4]^.6

“The wolf is likely to be within plain sight of the woodcutter.”

RI [S8, 13, 19, (5, 16, 18), 15; K5] {Subst W/x, C2/y, E4/e1, E2/e2;
Imm-Skol E5/e3}:

20. [E5 during E2]

21. [[C2 notice [W | E2]] ** E5]^.324

#  Up  to  here: 

The  woodcutter  may  notice  the  wolf’s  attacking  LRRH. 

RI [S8, 8, 13, 21; K6] {Subst W/x, E2/e1, C2/y, E5/e2; Imm-Skol C3/z, E6/e3}:

22. [C3 (coll person)]

23. [E5 cause-of E6]

24. [[C3 ((-ly severe) (punish W))] ** E6]^.087

#  The  wolf  may  be  severely  punished  by  some  group  of  people. 
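
The numeric annotations on conclusions 8, 15, 19, 21 and 24 (.9, .9, .6, .324 and .087) are consistent with multiplying the strength of the instantiated rule by the probabilities of its uncertain premises. The sketch below only checks that arithmetic. Combining premises by a plain product, as if they were independent, is an assumption made here for illustration rather than a statement of the inference rules, and `conclusion_prob` is a hypothetical helper.

```python
from math import prod


def conclusion_prob(rule_strength, premise_probs):
    """Rule strength times the product of the uncertain premises' probabilities.
    Premises known with certainty are simply omitted from the list."""
    return rule_strength * prod(premise_probs)


p8 = conclusion_prob(0.9, [])           # K1: the attack is extremely wicked
p15 = conclusion_prob(0.9, [])          # K4: the attack is conspicuous
p19 = conclusion_prob(0.6, [])          # K8: the wolf is within plain sight
p21 = conclusion_prob(0.6, [p15, p19])  # K5: the woodcutter notices
p24 = conclusion_prob(0.3, [p8, p21])   # K6: severe punishment

print(round(p21, 3))  # 0.324
print(round(p24, 3))  # 0.087
```

The unrounded value for conclusion 24 is .08748, which agrees with the reported .087.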

This  inference  chain  can  be  extended  to  provide  an  explanation  for  the  wolf’s 
decision  not  to  try  to  eat  LRRH  at  that  point  in  the  story.  First,  rule  K3  would  be 
slightly  augmented  so  as  to  express  the  fact  that  if  one  action  involves  another,  and 
that  other  action  has  certain  consequences,  then  these  are  also  consequences  of  the  first 
action.  Rule  K5  would  be  similarly  augmented  to  make  the  “noticing  episode”  e3  a  causal 
consequence  of  the  episode  e2  (or  action  [x  |  e2])  noticed.  The  “punishing  episode,”  E6, 
in  conclusion  would  then  be  inferred  to  be  a  consequence  of  the  wolf’s  attempt  to  eat 
LRRH. Given that being severely punished is very bad, and that agents generally refrain
from  actions  that  they  think  may  have  very  bad  consequences  for  them,  we  would  have 
an  explanation  for  the  wolf’s  restraint.  Note,  however,  that  this  requires  application  of 
the  simulative  inference  rule  (SIM),  i.e.,  we  must  attribute  the  above  inference  chain  to 
the  wolf,  and  draw  further  conclusions  from  this  attribution. 

7  Concluding  Remarks 

Our  logic  is  probably  the  most  expressive  yet  brought  to  bear  on  the  problem  of  narrative 
understanding.  It  makes  implicit  time  and  situation  dependencies  explicit  through  the 
use of episodic variables, and admits unbound “anaphoric” variables and the representation
of generic conditionals. The expressive power of our logic exceeds that of most
natural language systems (cf. Alshawi and van Eijck (1989)).

Our  use  of  episodic  variables  owes  a  debt  to  Davidson  (1969),  but  we  can  “attach” 
an  episodic  variable  to  any  formula,  whereas  Davidson’s  method  can  introduce  episodes 
only for atomic formulas. Thus, for Davidson, there can be no episodes involving quantification,
such as an episode of everyone in the room looking at Mary, and no episodes
involving negation, such as an episode of John not eating anything for ten hours. Yet
such  episodes  can  perfectly  well  be  cited  as  causal  antecedents  (e.g.,  sentences  (8)  and 
(9)  in  section  2;  or  “John  did  not  eat  anything  for  ten  hours,  and  as  a  result,  he  was 
famished”),  anaphorically  referred  to,  quantified  over,  etc. 

We  should  also  mention  the  Situation  Calculus  of  McCarthy  and  Hayes  (1969), 
whose  notion  of  a  situation  corresponds  exactly  to  our  notion  of  a  (possible)  moment 
of  time.  As  well,  the  Event  Calculus  of  Kowalski  (1986)  treats  events  as  individuals  as 
we do, the occasions of Firby and McDermott (1987) correspond to our “*,” and the
“'” operator of Hobbs et al. (1986) is similar to our “*.”³⁴ Like Davidson, however,
all  of  them  handle  only  atomic  formulas  and  are  unable  to  deal  with  events  involving 
quantification  and  logical  compounds. 

We have provided evidence that our episode-based logical form can provide a clean
foundation  for  story  understanding.  The  main  advantages  of  our  approach  are  as  follows: 

(a)  The  representation  of  phrase  structure  is  modular  and  transparent,  as 
is  the  mapping  from  phrase  structure  to  episodic  logic.  The  mapping 
handles  many  combinations  of  tense,  aspect  and  adverbials. 

(b) Episodic logic is expressively rich: it allows the content of most English
sentences and most world knowledge to be represented in an intuitively
comprehensible and formally analyzable manner. Restricted
quantifiers, modal operators, nominalization operators, episodic variables,
anaphoric variables, and generic conditionals are brought together
for the first time in a logic for narrative understanding.

(c)  Being  probabilistic,  our  rules  of  inference  allow  evidence  for  explanations 
or  predictions  to  be  weighed,  much  as  in  expert  systems. 

(d) All types of linguistic and domain knowledge are strictly separated from
parsing and inference control structure, allowing the former to be expanded
and revised independently of the latter.

(e) Hand-simulation of the processing of actual story fragments, and question-answering,
indicates that our logical framework is epistemologically adequate
for story understanding.

This  last  claim,  about  epistemological  adequacy,  may  come  as  something  of  a 
surprise.  Whatever  happened  to  scripts,  plans,  TAUs,  TOPs,  MOPs,  etc.?  Are  these 
“higher-level”  knowledge  structures  not  essential  to  story  comprehension?  We  do  not 
doubt  that  they  are.  However,  we  see  no  sharp  divisions  between  any  of  them.  The  more 
focused  the  successive  stages  of  a  script  are  on  an  ultimate  goal,  the  more  it  resembles 
a  plan.  The  more  abstract  its  level  of  description,  the  more  it  resembles  a  TAU  or  a 
TOP,  and  so  on.  Furthermore,  we  see  no  particular  obstacle  to  encoding  all  of  them 
as  axiomatic  knowledge  in  episodic  logic,  in  the  manner  of  the  examples  in  section  2. 

34 The relationship between their operator and ours appears to be Π' =df λeλx[[x Π] * e];
however, Hobbs et al. provide no formal semantics for their operator.




For  example,  the  M-BORROW  MOP  (Dyer  1983:  207)  can  be  cast  as  a  set  of  generic 
conditionals  along  the  following  lines.  If  some  person  x  wants  to  have  some  object  y 
temporarily,  which  he  knows  to  be  in  the  possession  of  some  person  z,  he  may  well  ask 
z to lend him y and this may induce z to do so, fulfilling x's goal. If some person x
has some object y on loan from some person z, then x is obligated to return y to z, and
z  will  probably  want  him  to  do  so;  etc.  We  consider  the  taxonomy  of  scripts,  plans, 
MOPs,  etc.,  and  their  elaborate  subcategorization,  more  of  a  potential  guide  to  control 
structure  -  what  knowledge  is  likely  to  be  useful  when  -  than  a  guide  to  representation. 

Much  work  remains  to  be  done  on  our  logic  -  for  instance,  on  the  formal  semantics 
of nominalization and propositional attitudes, and of probabilistic inference. As well,
we  need  to  compose  many  more  rules  of  translation  and  compile  a  substantial  body  of 
knowledge  for  particular  stories.  However,  our  implementation  to  date  has  proved  to 
be very gratifying (ECOLOGIC: Schubert et al. 1989), and we have incorporated several
techniques into ECOLOGIC which were developed for an earlier system based on ordinary
first-order logic (Ecosystem: de Haan & Schubert 1986, Miller et al. 1987, Miller &
Schubert 1988), facilitating efficient deduction (both general and specialized) and fast,
selective access to knowledge relevant to a particular set of concepts and topics. “Natural”
goal reduction remains to be implemented, but nonetheless the types of questions
handled by Econet are also handled by ECOLOGIC (e.g., “Did anyone have some cake?”
or “Does grandmother live in a shoe?”).